Beta1,3-N-acetyl-D-galactosamine transferase protein, nucleic acid encoding the same and method of examining canceration using the same

ABSTRACT

The N-acetyl-D-galactosamine transferase protein of the present invention is characterized by transferring N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage, and it preferably has the amino acid sequence shown in SEQ ID NO: 2 or 4. The canceration assay according to the present invention uses a nucleic acid for measurement which hybridizes under stringent conditions to the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at least one of them.

TECHNICAL FIELD

The present invention relates to a novel β1,3-N-acetyl-D-galactosaminyltransferase protein and a nucleic acid encoding the same, as well as a canceration assay using the same, etc.

BACKGROUND ART

Recent attention has been focused on the in vivo roles of sugar chains and/or complex carbohydrates. For example, factors for determining blood types are glycoproteins, and it is glycolipids that are involved in the functions of the nervous system. Thus, enzymes having the ability to synthesize sugar chains constitute an extremely important key to analyzing physiological activities provided by various sugar chains.

For example, N-acetyl-D-galactosamine (hereinafter also referred to as “GalNAc”) is among the components constituting glycosaminoglycans, as well as being a sugar residue found in various sugar chain structures such as glycosphingolipids and mucin-type sugar chains. Thus, an enzyme transferring GalNAc will serve as an extremely important tool in analyzing the roles of sugar chains in various tissues in vivo.

As described above, attention has been focused on the in vivo roles of sugar chains, but it cannot be said that sufficient headway has been made in analyzing in vivo sugar chain synthesis. This is in part because the mechanism of sugar chain synthesis and the in vivo localization of sugar synthesis have not been fully analyzed. In analyzing the mechanism of sugar chain synthesis, it is necessary to analyze glycosylation enzymes (particularly glycosyltransferases) and to analyze what kind of sugar chains are synthesized by means of the enzymes. To this end, there is a strong demand for searching for novel glycosyltransferases and analyzing their functions.

There are some reports of glycosyltransferases having the ability to transfer GalNAc (Non-patent Documents 1 to 4). For example, among human GalNAc transferases, enzymes transferring GalNAc with “β1,4 linkage” are known (Non-patent Document 1) and enzymes using “galactose” as their acceptor substrate are known as enzymes transferring GalNAc with β1,3 linkage (Non-patent Document 2) (“β1,3” or “β3” as used herein refers to a glycosidic linkage between an α-hydroxyl group at the 1-position of a sugar residue in an acceptor substrate and a hydroxyl group at the 3-position of a sugar residue to be transferred and linked thereto).

On the other hand, in higher organisms like humans, no enzyme is known to transfer GalNAc with “β1,3 linkage” to “N-acetylglucosamine” (hereinafter also referred to as “GlcNAc”).

Although there is a report showing that the sugar chain structure in which GalNAc and GlcNAc are linked in a β1,3 fashion was confirmed in sugar chains on neutral glycolipids of fly, a kind of arthropod (Non-patent Document 5), it has been believed that such a sugar chain structure is not present in mammals, particularly in humans, to begin with.

Patent Document 1

International Patent Publication No. WO 01/79556

Non-patent Document 1

Cancer Res. 1993 Nov. 15; 53(22):5395-400: Yamashiro S, Ruan S, Furukawa K, Tai T, Lloyd K O, Shiku H, Furukawa K. Genetic and enzymatic basis for the differential expression of GM2 and GD2 gangliosides in human cancer cell lines.

Non-patent Document 2

Biochim Biophys Acta. 1995 Jan. 3; 1254(1):56-65: Taga S, Tetaud C, Mangeney M, Tursz T, Wiels J. Sequential changes in glycolipid expression during human B cell differentiation: enzymatic bases.

Non-patent Document 3

Proc Natl Acad Sci USA. 1996 Oct. 1; 93(20):10697-702: Haslam D B, Baenziger J U. Expression cloning of Forssman glycolipid synthetase: a novel member of the histo-blood group ABO gene family.

Non-patent Document 4

J Biol Chem. 1997 Sep. 19; 272(38):23503-14: Wandall H H, Hassan H, Mirgorodskaya E, Kristensen A K, Roepstorff P, Bennett E P, Nielsen P A, Hollingsworth M A, Burchell J, Taylor-Papadimitriou J, Clausen H. Substrate specificities of three members of the human UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase family, GalNAc-T1, -T2, and -T3.

Non-patent Document 5

J. Biochem. (Tokyo) 1990 June; 107(6):899-903: Sugita M, Inagaki F, Naito H, Hori T. Studies on glycosphingolipids in larvae of the green-bottle fly, Lucilia caesar: two neutral glycosphingolipids having large straight oligosaccharide chains with eight and nine sugars.

DISCLOSURE OF THE INVENTION

A problem to be solved by the present invention is to provide a polypeptide which is a mammal-derived (particularly human-derived) glycosyltransferase and which has a novel transferase activity to transfer GalNAc with β1,3 linkage to GlcNAc, as well as a nucleic acid encoding such a polypeptide, etc.

Another problem to be solved by the present invention is to provide a transformant expressing the nucleic acid in host cells, a method for producing the encoded protein by allowing the transformant to produce the protein and then collecting the protein, and an antibody recognizing the protein.

On the other hand, since sugar chain synthesis may be affected by canceration, the identification and expression analysis of such a glycosylation enzyme can be expected to provide an index useful for cancer diagnosis, etc. The present invention also provides detailed procedures and criteria useful for canceration assay or the like by analyzing and comparing, at the tissue or cell line level, the transcription level of such a protein which varies in correlation with canceration or malignancy.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram showing changes in the activity of the G34 enzyme protein according to this example, plotted against the reaction time.

FIG. 2A shows the results of NMR measurement, used for analysis of the sugar chain structure synthesized by the G34 enzyme protein according to this example.

FIG. 2B shows a partial magnified view of the NMR results in FIG. 2A.

FIG. 3 is a table summarizing NOE in NMR shown in FIG. 2. Various conditions for the data in Table 1 are as follows: 1.08 mM, 298 K, D₂O, CH₂ (high)=4.557 ppm for non-marked data, chemical shifts for data marked with * are CH₂ (low)=4.778 ppm, phenyl(ortho)=7.265 ppm, phenyl(meta)=7.354 ppm and phenyl(para)=7.320 ppm, calculated from the 1D spectrum.

FIG. 4 is a table summarizing relevant data (tentative NOE) for each pyranose with respect to NMR shown in FIG. 2 (s: strong, m: medium, w: weak, vw: very weak, A: GlcNAc, B: GalNAc).

FIG. 5 shows a comparison of amino acid sequences between G34 enzyme protein according to this example and known β3Gal transferases.

FIG. 6 shows a comparison of motifs involved in the β3-linking activity between G34 enzyme protein according to this example and various known β3-linking glycosyltransferases. “b3” represents a β1-3 linkage and “Gn” represents GlcNAc.

FIG. 7 is a diagram showing the pH dependence of the activity of the G34 enzyme protein according to this example.

FIG. 8 is a diagram showing ion requirement for the activity of the G34 enzyme protein according to this example.

FIG. 9 presents graphs showing the expression levels of the G34 enzyme protein according to this example in human cell lines.

FIG. 10 shows amino acid sequence alignment between mouse G34 according to this example (upper) and human G34 (lower).

FIG. 11 shows the result of in situ hybridization performed on a mouse testis sample using the mG34 nucleic acid according to this example.

DETAILED DESCRIPTION OF THE INVENTION

To solve the problems stated above, the inventors of the present invention have attempted to isolate and purify a nucleic acid of interest, which may have high sequence identity, on the basis of the nucleotide sequence of an enzyme gene functionally similar to the intended enzyme. More specifically, first, the sequence of a known glycosyltransferase β3 galactosyltransferase 6 (β3GalT6) was used as a query for a BLAST search to thereby find a sequence with homology (GenBank No. AX285201). It should be noted that this nucleotide sequence was known as the sequence of SEQ ID NO: 1006 disclosed in International Publication No. WO 01/79556 (Patent Document 1 listed above), but its activity remained unknown.

The inventors of the present invention then independently cloned the above gene by PCR, determined its nucleotide sequence (SEQ ID NO: 1) and putative amino acid sequence (SEQ ID NO: 2), and succeeded in identifying a certain biological activity of a polypeptide encoded by the nucleic acid, thus completing the present invention. Moreover, when using the sequence as a query to search mouse genes, the inventors found the nucleotide sequence of SEQ ID NO: 3 and its putative amino acid sequence (SEQ ID NO: 4).

The gene having the nucleotide sequence of SEQ ID NO: 1 and the protein having the amino acid sequence of SEQ ID NO: 2 were designated human G34, while the gene having the nucleotide sequence of SEQ ID NO: 3 and the protein having the amino acid sequence of SEQ ID NO: 4 were designated mouse G34.

According to the studies of the inventors, the above G34 protein uses an N-acetyl-D-galactosamine residue as a donor substrate and an N-acetyl-D-glucosamine residue as an acceptor substrate. As detailed later in Example 2, the G34 protein was found to retain three motifs in its amino acid sequence, which are well conserved in the enzyme family transferring various sugars (e.g., galactose, N-acetyl-D-glucosamine) in the linking mode of β1,3. In light of these points, the G34 protein was unexpectedly believed to have transferase activity to synthesize a novel sugar chain structure “GalNAc-β-1,3-GlcNAc,” for which no report has been made for mammals, particularly humans. The linking mode was actually confirmed by NMR.

Namely, the present invention relates to a β1,3-N-acetyl-D-galactosaminyltransferase protein which transfers N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

An enzyme protein according to a preferred embodiment of the present invention may have at least one or any combination of the following properties (a) to (c).

(a) Acceptor Substrate Specificity

When using an oligosaccharide as an acceptor substrate, the enzyme protein shows transferase activity toward Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, Gal-β1-3(GlcNAc-β1-6)GalNAc-α-pNp, GlcNAc-β1-3GalNAc-α-pNp and GlcNAc-β1-6GalNAc-α-pNp (“GlcNAc” represents an N-acetyl-D-glucosamine residue, “GalNAc” represents an N-acetyl-D-galactosamine residue, “Bz” represents a benzyl group, “pNp” represents a p-nitrophenyl group, and “-” represents a glycosidic linkage. Numbers in these formulae each represent the carbon number in the sugar ring where a glycosidic linkage is present, and “α” and “β” represent anomers of the glycosidic linkage at the 1-position of the sugar ring. An anomer whose positional relationship with CH₂OH or CH₃ at the 5-position is trans and cis is represented by “α” and “β”, respectively).

Preferably, the enzyme protein is substantially free from transferase activity toward Bz-α-GlcNAc and Gal-β1-3GlcNAc-β-pNp.

(b) Reaction pH

The activity is lower in a pH range of 6.2 to 6.6 than in other pH ranges.

(c) Divalent Ion Requirement

Although the above activity is enhanced at least in the presence of Mn²⁺, Co²⁺ or Mg²⁺, the Mn²⁺-induced enhancement of the activity is almost completely eliminated in the presence of Cu²⁺.

Moreover, in a preferred embodiment of the above glycosyltransferase protein, the glycosyltransferase protein of the present invention comprises the following polypeptide (A) or (B):

(A) a polypeptide which has the amino acid sequence shown in SEQ ID NO: 2 or 4; or

(B) a polypeptide which has an amino acid sequence with substitution, deletion or insertion of one or more amino acids in the amino acid sequence shown in SEQ ID NO: 2 or 4 and which transfers N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

Moreover, in a more preferred embodiment of the above glycosyltransferase protein, the above polypeptide (A) is a glycosyltransferase protein consisting of a polypeptide having an amino acid sequence covering amino acids 189 to 500 shown in SEQ ID NO: 2. Likewise, in an even more preferred embodiment of the above glycosyltransferase protein, the above polypeptide (A) is a glycosyltransferase protein consisting of a polypeptide having an amino acid sequence covering amino acids 36 to 500 shown in SEQ ID NO: 2.

In addition, other embodiments of the glycosyltransferase protein of the present invention encompass proteins consisting of polypeptides having amino acid sequences sharing more than 30% identity, preferably at least 40% identity, and more preferably at least 50% identity with an amino acid sequence covering amino acids 189 to 500 shown in SEQ ID NO: 2 or amino acids 35 to 504 shown in SEQ ID NO: 4.

In another aspect, the present invention provides a nucleic acid consisting of a nucleotide sequence encoding any one of the above polypeptides or a nucleotide sequence complementary thereto.

In a preferred embodiment, the nucleic acid encoding the protein of the present invention is a nucleic acid consisting of the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at least one of them. More preferably, in the case of human origin, such a nucleic acid consists of a nucleotide sequence covering nucleotides 565 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto, and most preferably consists of a nucleotide sequence covering nucleotides 106 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto. In the case of mouse origin, such a nucleic acid consists of a nucleotide sequence covering nucleotides 103 to 1512 shown in SEQ ID NO: 3 or a nucleotide sequence complementary thereto.
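The nucleotide ranges given above line up with the amino acid ranges of the preferred polypeptides (amino acids 189 to 500 and 36 to 500 of SEQ ID NO: 2, and 35 to 504 of SEQ ID NO: 4) by simple codon arithmetic. The following is a minimal sketch of that arithmetic, not part of the disclosed method. It assumes that the open reading frame starts at nucleotide 1 of each sequence, and that the human end position 1503 includes a three-nucleotide termination codon after residue 500; both are assumptions made for illustration, and the function name `coding_range` is hypothetical.

```python
def coding_range(aa_start: int, aa_end: int, include_stop: bool = False) -> tuple:
    """Map a 1-based amino acid range onto 1-based nucleotide positions.

    Assumes (for illustration only) that the reading frame begins at
    nucleotide 1, so residue n is encoded by nucleotides 3n-2 .. 3n.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    nt_start = 3 * (aa_start - 1) + 1
    nt_end = 3 * aa_end
    if include_stop:
        # Extend by one codon to cover a termination codon (assumption).
        nt_end += 3
    return nt_start, nt_end


if __name__ == "__main__":
    print(coding_range(189, 500, include_stop=True))
    print(coding_range(36, 500, include_stop=True))
    print(coding_range(35, 504))
```

Under these assumptions the three calls reproduce the ranges 565 to 1503, 106 to 1503 and 103 to 1512 stated in the text.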

Embodiments of the above nucleic acids according to the present invention encompass DNA.

The present invention further provides a vector carrying any one of the above nucleic acids and a transformant containing the vector.

In yet another aspect, the present invention provides a method for producing a β1,3-N-acetyl-D-galactosaminyltransferase protein, which comprises growing the above transformant to express the above glycosyltransferase protein and collecting the glycosyltransferase protein from the grown transformant.

In yet another aspect, the present invention provides an antibody recognizing any one of the above β1,3-N-acetyl-D-galactosaminyltransferase proteins.

On the other hand, in response to the discovery of the above G34, the inventors of the present invention have clarified that the expression level of G34 mRNA is increased significantly in cancerous tissues and cell lines.

Thus, the present invention also provides a nucleic acid for measurement, which is useful as an index of canceration or malignancy and which hybridizes under stringent conditions to the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at least one of them.

The nucleic acid for measurement of the present invention may typically consist of a nucleotide sequence covering at least a dozen contiguous nucleotides in the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary thereto.
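As an illustration of the "at least a dozen contiguous nucleotides" criterion, the sketch below checks whether a candidate oligonucleotide occurs as a contiguous stretch of a target sequence or of its complementary strand. It is a simplified exact-match check and does not model hybridization under stringent conditions. The function names and the short toy target are hypothetical; the toy target is not SEQ ID NO: 1 or 3.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_contiguous_fragment(candidate: str, target: str, min_len: int = 12) -> bool:
    """True if candidate (>= min_len nt) lies within target or its complement."""
    candidate = candidate.upper()
    target = target.upper()
    if len(candidate) < min_len:
        return False
    return candidate in target or candidate in reverse_complement(target)


if __name__ == "__main__":
    # Toy sequence for demonstration only (not taken from the sequence listing).
    toy_target = "ATGGCTAGCTAGGATCCGATCGATCGTTAGC"
    sense_probe = toy_target[3:18]
    antisense_probe = reverse_complement(sense_probe)
    print(is_contiguous_fragment(sense_probe, toy_target))
    print(is_contiguous_fragment(antisense_probe, toy_target))
```

The default `min_len` of 12 mirrors the "dozen" in the text; a practical probe or primer would normally be chosen with additional considerations that this sketch ignores.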

In a preferred embodiment, the nucleic acid for measurement of the present invention encompasses a probe consisting of the nucleotide sequence shown in SEQ ID NO: 16 or a nucleotide sequence complementary thereto, as well as a primer set consisting of the following nucleotide sequences (1) or (2):

(1) a pair of the nucleotide sequences shown in SEQ ID NOs: 14 and 15; or

(2) a pair of the nucleotide sequences shown in SEQ ID NOs: 17 and 18.

Also, the nucleic acid for measurement of the present invention may be used as a tumor marker.

The present invention further provides a method for assaying canceration in a biological sample, which comprises:

(a) using any one of the above nucleic acids to measure the transcription level of the nucleic acid in the biological sample; and

(b) determining whether the measured value is significantly higher than that of a normal biological sample.
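Step (b) can be pictured as a comparison of one measured transcription level against a set of normal reference measurements. The sketch below is one possible way to express such a comparison and is not the criterion defined in this specification: the cutoff rule (mean of the normal samples plus a chosen number of standard deviations), the function name and the numbers in the usage example are all hypothetical and serve only to make the comparison concrete.

```python
from statistics import mean, stdev


def is_significantly_higher(sample_level: float,
                            normal_levels: list,
                            n_sd: float = 2.0) -> bool:
    """Compare one transcription level with normal reference levels.

    Hypothetical rule: the sample counts as higher when it exceeds
    mean(normal) + n_sd * stdev(normal). The rule and n_sd are assumptions.
    """
    if len(normal_levels) < 2:
        raise ValueError("need at least two normal measurements")
    cutoff = mean(normal_levels) + n_sd * stdev(normal_levels)
    return sample_level > cutoff


if __name__ == "__main__":
    # Made-up relative expression values, for illustration only.
    normal = [1.0, 1.2, 0.8, 1.1, 0.9]
    print(is_significantly_higher(3.0, normal))
    print(is_significantly_higher(1.1, normal))
```

In practice the choice of statistical test and threshold would depend on the measurement method and sample numbers, which this sketch does not address.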

In a preferred embodiment, the canceration assay of the present invention includes cases where the measurement of the transcription level is made by hybridization or PCR targeted at the above biological sample and using any one of the above nucleic acids.

In a further aspect of the canceration assay of the present invention, the present invention provides a method for assaying the effectiveness of treatment in cancer therapy, which comprises using any one of the above nucleic acids to measure the transcription level of the nucleic acid in a biological sample treated by cancer therapy, and determining whether the measured value is significantly lower than that obtained before treatment or than that of an untreated sample.

In particular, the above biological sample may be derived from the large intestine (colon) or lung.

MODE FOR CARRYING OUT THE INVENTION

The mode for carrying out the present invention will be described in detail below.

(1) Nucleic Acid Encoding the G34 Enzyme Protein of the Present Invention

Based upon the above discovery, the inventors of the present invention expressed the G34 enzyme protein encoded by the nucleic acid, isolated and purified the protein, and further identified its enzymatic activity. When focusing on the fact that an amino acid sequence having the desired enzymatic activity was identified, the nucleotide sequence of SEQ ID NO: 1 or 3 is one embodiment of a nucleic acid encoding the isolated polypeptide having the enzymatic activity. This means that the nucleic acid of the present invention encompasses all, but a limited number of, nucleic acids having degenerate nucleotide sequences capable of encoding the same amino acid sequence for the G34 enzyme protein.
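That the set of degenerate coding sequences is finite, though large, follows from the genetic code: each amino acid is specified by between one and six codons, so the number of nucleotide sequences encoding a given peptide is the product of the per-residue codon counts. The sketch below counts them using the standard genetic code. The function name and the short example peptides are hypothetical and are not taken from SEQ ID NO: 2 or 4.

```python
from collections import Counter
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding the peptide."""
    peptide = peptide.upper()
    unknown = set(peptide) - set(CODONS_PER_RESIDUE) | ({"*"} & set(peptide))
    if unknown:
        raise ValueError(f"unsupported residue(s): {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


if __name__ == "__main__":
    # Short example peptides for illustration only.
    print(count_coding_sequences("MW"))
    print(count_coding_sequences("MLS"))
```

Methionine and tryptophan each have a single codon, while leucine and serine each have six, so the count grows quickly with peptide length but always remains a finite number.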

The present invention also provides a nucleic acid encoding the full-length or a fragment of a polypeptide consisting of a novel amino acid sequence as mentioned above. A typical nucleic acid encoding such a novel polypeptide may have the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at least one of them.

The nucleic acid of the present invention also encompasses both single-stranded and double-stranded DNA and their complementary RNA. Examples of DNA include naturally-occurring DNA, recombinant DNA, chemically-bound DNA, PCR-amplified DNA, and combinations thereof. However, DNA is preferred in terms of stability during vector and/or transformant preparation.

The nucleic acid of the present invention may be prepared in the following manner, by way of example.

First, the known sequence under GenBank No. AX285201 or a part thereof may be used to perform nucleic acid amplification on a cDNA library in a routine manner using basic procedures for genetic engineering (e.g., hybridization, nucleic acid amplification), thereby cloning the nucleic acid of the present invention. Since the nucleic acid may be obtained, e.g., as a DNA fragment of approximately 1.5 kbp as a PCR product, the fragment may be separated using techniques for screening DNA fragments based on their molecular weight (e.g., agarose gel electrophoresis) and isolated in a routine manner, e.g. using techniques for excising a specific band.

Moreover, according to the putative amino acid sequence (SEQ ID NO: 2 or 4) of the isolated nucleic acid, the nucleic acid may be estimated to have a hydrophobic transmembrane region at its N-terminal end. By preparing a region of a nucleotide sequence encoding a polypeptide free from this transmembrane region, it is also possible to obtain the nucleic acid of the present invention that encodes a soluble form of the polypeptide.

Based on the nucleotide sequence of the nucleic acid disclosed herein, it is easy for those skilled in the art to create appropriate primers from nucleotide sequences located at both ends of a nucleic acid of interest or a region thereof to be prepared and to use the primers thus created for nucleic acid amplification to amplify and prepare the region of interest.
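The idea of taking primers from both ends of a region can be sketched as follows: the forward primer is the first stretch of the region as written, and the reverse primer is the reverse complement of the last stretch. This is only a minimal illustration; it ignores melting temperature, GC content, secondary structure and any added adapter sequences. The function names, the default primer length and the toy region are hypothetical, and the toy region is not taken from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_end_primers(region: str, primer_len: int = 20) -> tuple:
    """Return (forward, reverse) primers taken from the two ends of region.

    Both primers are written 5' to 3'. primer_len is an illustrative default.
    """
    region = region.upper()
    if primer_len < 1 or len(region) < 2 * primer_len:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:primer_len]
    reverse = reverse_complement(region[-primer_len:])
    return forward, reverse


if __name__ == "__main__":
    # Toy region for demonstration only.
    toy_region = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    fwd, rev = design_end_primers(toy_region, primer_len=8)
    print(fwd, rev)
```

The reverse primer anneals to the strand shown and is extended back toward the forward primer, which is why it is reverse-complemented.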

The above nucleic acid amplification includes, for example, reactions requiring thermal cycling such as polymerase chain reaction (PCR) [Saiki R. K., et al., Science, 230, 1350-1354 (1985)], ligase chain reaction (LCR) [Wu D. Y., et al., Genomics, 4, 560-569 (1989); Barringer K. J., et al., Gene, 89, 117-122 (1990); Barany F., Proc. Natl. Acad. Sci. USA, 88, 189-193 (1991)] and transcription-based amplification [Kwoh D. Y., et al., Proc. Natl. Acad. Sci. USA, 86, 1173-1177 (1989)], as well as isothermal reactions such as strand displacement amplification (SDA) [Walker G. T., et al., Proc. Natl. Acad. Sci. USA, 89, 392-396 (1992); Walker G. T., et al., Nuc. Acids Res., 20, 1691-1696 (1992)], self-sustained sequence replication (3SR) [Guatelli J. C., Proc. Natl. Acad. Sci. USA, 87, 1874-1878 (1990)] and Qβ replicase system [Lizardi et al., BioTechnology 6, p. 1197-1202 (1988)]. It is also possible to use other reactions, e.g., nucleic acid sequence-based amplification (NASBA) through competitive amplification between a target nucleic acid and a mutated sequence, found in European Patent No. 0525882. Preferred is PCR.

The use of the nucleic acid of the present invention also enables the expression of the intended enzyme protein or the provision of probes and antisense primers for the purpose of medical research or gene therapy, as described later.

Those skilled in the art will be able to obtain a nucleic acid as useful as the sequence of SEQ ID NO: 1 or 3 by preparing a nucleic acid consisting of a nucleotide sequence sharing a certain homology with the nucleotide sequence of SEQ ID NO: 1 or 3. For example, the homologous nucleic acid of the present invention encompasses nucleic acids encoding proteins which share homology with the amino acid sequence shown in SEQ ID NO: 2 or 4 and which have the ability to transfer N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

To identify the range of nucleic acids encoding such homologous proteins according to the present invention, an identity search is performed for the nucleic acid sequence shown in SEQ ID NO: 1 or 3 of the present invention, indicating that the nucleic acid sequence shares 40% identity with the nucleic acid sequence of a known β1,4GalNAc transferase showing the highest homology (Non-patent Document 1 listed above) and also shares 40% identity with the nucleic acid sequence of a known β1,3Gal transferase showing the highest homology (Non-patent Document 2 listed above). In light of these points, a preferred nucleic acid sequence encoding the homologous protein of the present invention typically shares more than 40% identity, more preferably at least 50% identity, and particularly preferably at least 60% identity with any one of the entire nucleotide sequence of SEQ ID NO: 1 or 3, preferably a partial nucleotide sequence consisting of nucleotides 106 to 1503 in SEQ ID NO: 1, preferably a partial nucleotide sequence consisting of nucleotides 103 to 1512 in SEQ ID NO: 3, or nucleotide sequences complementary to these sequences.

Likewise, the nucleotide sequences shown in SEQ ID NOs: 1 and 3 share 86% identity with each other. In light of this point, a preferred nucleic acid sequence encoding the homologous protein of the present invention can be defined as sharing at least 86%, preferably 90% identity with any one of the entire nucleotide sequence of SEQ ID NO: 1, preferably nucleotides 106 to 1503, or a nucleotide sequence complementary thereto.

The above percentage of identity may be determined by visual inspection and mathematical calculation. Alternatively, the percentage of identity between two nucleic acid sequences may be determined by comparing sequence information using the GAP computer program, version 6.0, described by Devereux et al., Nucl. Acids Res. 12: 387, 1984 and available from the University of Wisconsin Genetics Computer Group (UWGCG). The preferred default parameters for the GAP program include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities) for nucleotides, and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745, 1986, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence and Structure, pp. 353-358, National Biomedical Research Foundation, 1979; (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps. It is also possible to use other sequence comparison programs used by those skilled in the art.
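Once two nucleotide sequences have been aligned, the "mathematical calculation" of percent identity reduces to counting identical positions, which corresponds to scoring with a unary matrix (1 for identities, 0 for non-identities). The sketch below performs only this counting step on an alignment that is already given; it does not reimplement the GAP program or its gap penalties. Excluding gap columns from the denominator is an assumption made here, as conventions differ between programs, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Each column scores 1 if the two symbols are identical and 0 otherwise.
    Columns containing a gap in either sequence are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only: 7 of 8 ungapped columns match.
    print(percent_identity("ACGT-ACGT", "ACGTTACCT"))
```

For real comparisons, the alignment itself would come from a program such as GAP or another tool familiar to those skilled in the art.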

Other nucleic acids homologous as the structural gene of the present invention typically include nucleic acids which hybridize under stringent conditions to a nucleotide consisting of a nucleotide sequence within SEQ ID NO: 1 or 3, preferably a nucleotide sequence consisting of nucleotides 106 to 1503 of SEQ ID NO: 1, preferably a nucleotide sequence consisting of nucleotides 103 to 1512 of SEQ ID NO: 3, or a nucleotide sequence complementary thereto and which encode polypeptides having the ability to transfer N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage.

As used herein, “under stringent conditions” means that a nucleic acid hybridizes under conditions of moderate or high stringency. More specifically, conditions of moderate stringency may readily be determined by those having ordinary skill in the art, e.g., depending on the length of DNA. Primary conditions can be found in Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, Vol. 1, 7.42-7.45, Cold Spring Harbor Laboratory Press, 2001 and include the use of a prewashing solution for nitrocellulose filters 5×SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0), hybridization conditions of about 50% formamide, 2×SSC to 6×SSC at about 40-50° C. (or other similar hybridization solutions, such as Stark's solution, in about 50% formamide at about 42° C.) and washing conditions of about 60° C., 0.5×SSC, 0.1% SDS. Conditions of high stringency can also be readily determined by those skilled in the art, e.g., depending on the length of DNA. In general, such conditions include hybridization and/or washing at a higher temperature and/or at a lower salt concentration than that required under conditions of moderate stringency and, for example, are defined as hybridization conditions as above and with washing at about 68° C., 0.2×SSC, 0.1% SDS. Those skilled in the art will recognize that the temperature and washing solution salt concentration can be adjusted as necessary according to factors such as the length of nucleotide sequences.

As described above, those skilled in the art will readily determine and achieve conditions of suitably moderate or high stringency on the basis of common knowledge about hybridization conditions which are known in the art, as well as on the empirical rule which will be obtained through commonly used experimental means.

(2) Vector and Transformant of the Present Invention

The present invention provides a recombinant vector carrying the above nucleic acid. Procedures for integrating a DNA fragment of the nucleic acid into a vector (e.g., a plasmid) include those described in Sambrook, J. et al., Molecular Cloning, A Laboratory Manual (3rd edition), Cold Spring Harbor Laboratory, 1.1 (2001). For convenience, a commercially available ligation kit (e.g., a product of TaKaRa Shuzo Co., Ltd., Japan) may be used.

The recombinant vector (e.g., recombinant plasmid) thus obtained may be introduced into host cells (e.g., E. coli DH5α, TB1, LE392, or XL-LE392 or XL-1Blue). Procedures for introducing the plasmid into host cells include those described in Sambrook, J. et al., Molecular Cloning, A Laboratory Manual (3rd edition), Cold Spring Harbor Laboratory, 16.1 (2001), exemplified by the calcium chloride method or the calcium chloride/rubidium chloride method, electroporation, electroinjection, chemical treatment (e.g., PEG treatment), and the gene gun method.

A vector which can be used may be prepared readily by linking a desiredgene to a recombination vector available in the art (e.g., plasmid DNA)in a routine manner. Specific examples of a vector to be used include,but are not limited to, E. coli-derived plasmids such as pDONR201,pBluescript, pUC18, pUC19 and pBR322.

Those skilled in the art will be able to select appropriate restrictionends to fit into the intended expression vector. The expression vectormay be selected appropriately by those skilled in the art such that thevector is suitable for host cells where the enzyme of the presentinvention is to be expressed. Moreover, the expression vector ispreferably constructed to allow regions involved in gene expression(e.g., promoter region, enhancer region and operator region) to beproperly located to ensure expression of the above nucleic acid intarget host cells, so that the nucleic acid is properly expressed.

The type of expression vector is not limited in any way as long as thevector allows expression of a desired gene in various prokaryotic and/oreukaryotic host cells and has the function of producing a desiredprotein. Preferred examples include pQE-30, pQE-60, pMAL-C2, pMAL-p2 andpSE420 for E. coli expression, pYES2 (Saccharomyces) and pPIC3.5K,pPIC9K and pAO815 (all Pichia) for yeast expression, as well aspFastBac, pBacPAK8/9, pBK283, pVL1392 and pBlueBac4.5 for insectexpression.

To construct the expression vector, a Gateway system (InvitrogenCorporation) may be used which does not require restriction treatmentand ligation operation. The Gateway system is a site-specificrecombination system which allows cloning while maintaining theorientation of PCR products and also allows subcloning of a DNA fragmentinto a properly modified expression vector. More specifically, thissystem prepares an expression clone corresponding to the intendedexpression system by creating an entry clone from a PCR product and adonor vector by the action of a site-specific recombinase BP clonase andthen transferring the PCR product to a destination vector which allowsrecombination with this clone by the action of another recombinase LRclonase. One feature of this system is that a time- and labor-consumingsubcloning step which requires treatment with restriction enzymes and/orligases can be eliminated when an entry clone is created to begin with.

The above expression vector carrying the nucleic acid of the presentinvention may be integrated into host cells to give a transformant forproducing the polypeptide of the present invention. In general, hostcells used for obtaining the transformant may be either eukaryotic cells(e.g., mammalian cells, yeast, insect cells) or prokaryotic cells (e.g.,E. coli, Bacillus subtilis). Also, cultured cells of human origin (e.g.,HeLa, 293T, SH-SY5Y) or mouse origin (e.g., Neuro2a, NIH3T3) may be usedfor this purpose. All of these host cells are known and commerciallyavailable (e.g., from Dainippon Pharmaceutical Co., Ltd., Japan), oravailable from public research institutions (e.g., RIKEN Cell Bank).Alternatively, it is also possible to use embryos, organs, tissues ornon-human individuals.

Since the nucleic acid of the present invention was found from humangenome libraries, it is believed that when eukaryotic cells are used ashost cells, the G34 enzyme protein of the present invention may haveproperties close to native proteins (e.g., embodiments whereglycosylation occurs). In light of this point, it is preferable toselect eukaryotic cells, particularly mammalian cells, as host cells.Specific examples of mammalian cells include animal cells of mouse,Xenopus laevis, rat, hamster, monkey or human origin or cultured celllines established from these cells. E. coli, yeast or insect cellsavailable for use as host cells are specifically exemplified by E. coli(e.g., DH5α, M15, JM109, BL21), yeast (e.g., INVSc1 (Saccharomyces),GS115, KM71 (both Pichia)) or insect cells (e.g., Sf21, BmN4, silkwormlarva).

In general, an expression vector can be prepared by linking at least a promoter, an initiation codon, a gene encoding a desired protein, a termination codon and a terminator region to an appropriate replicable unit to give a continuous loop. In this case, if desired, it is also possible to use an appropriate DNA fragment (e.g., linkers, other restriction enzyme sites) through routine techniques such as digestion with a restriction enzyme and/or ligation using T4 DNA ligase. When bacterial (particularly E. coli) cells are used as host cells, an expression vector is generally composed of at least a promoter/operator region, an initiation codon, a gene encoding a desired protein, a termination codon, a terminator and a replicable unit. When yeast cells, plant cells, animal cells or insect cells are used as host cells, it is generally preferred that an expression vector comprises at least a promoter, an initiation codon, a gene encoding a desired protein, a termination codon and a terminator. In this case, the vector may also comprise DNA encoding a signal peptide, an enhancer sequence, 5′- and 3′-terminal untranslated regions of the desired gene, a selective marker region or a replicable unit, as appropriate.

A replicable unit refers to DNA having the ability to replicate its entire DNA sequence in host cells and includes a native plasmid, an artificially modified plasmid (i.e., a plasmid prepared from a native plasmid) and a synthetic plasmid. Examples of a preferred plasmid include plasmid pQE30, pET or pCAL or an artificially modified product thereof (i.e., a DNA fragment obtained from pQE30, pET or pCAL by treatment with an appropriate restriction enzyme) for E. coli cells, plasmid pYES2 or pPIC9K for yeast cells, as well as plasmid pBacPAK8/9 for insect cells.

A methionine codon (ATG) may be given as an example of an initiation codon preferred for the vector of the present invention. Examples of a termination codon include commonly used termination codons (e.g., TAG, TGA, TAA). As for enhancer and terminator sequences, it is also possible to use those commonly used by those skilled in the art, such as SV40-derived enhancer and terminator sequences.
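As a rough illustration of the initiation and termination codons mentioned above, the following Python sketch checks whether a coding sequence starts with ATG, ends with TAG, TGA or TAA, and contains no in-frame termination codon in between. The function name and any sequence passed to it are hypothetical and are not taken from SEQ ID NO: 1 or 3.

```python
# Termination codons listed in the text above.
STOP_CODONS = {"TAG", "TGA", "TAA"}


def has_valid_orf_boundaries(cds: str) -> bool:
    """Return True if cds starts with ATG, ends with a termination codon,
    and has no in-frame termination codon before the last codon.

    This is a simplified check for illustration only (hypothetical helper).
    """
    seq = cds.upper()
    # A coding sequence needs at least a start and a stop codon
    # and must be a whole number of codons long.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )
```

Such a check may be applied to an insert before it is cloned into an expression vector; it does not assess promoter, enhancer or terminator sequences.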

As a selective marker, a commonly used one can be used in a routine manner. Examples include antibiotic resistance genes such as those conferring resistance to tetracycline, ampicillin, kanamycin, neomycin, hygromycin or spectinomycin.

The introduction (also referred to as transformation or transfection) of the expression vector according to the present invention into host cells may be accomplished by using conventionally known techniques. Transformation may be accomplished, for example, by the method of Cohen et al. [Proc. Natl. Acad. Sci. USA, 69, 2110 (1972)], the protoplast method [Mol. Gen. Genet., 168, 111 (1979)] or the competent cell method [J. Mol. Biol., 56, 209 (1971)] for bacterial cells (e.g., E. coli, Bacillus subtilis) and by the method of Hinnen et al. [Proc. Natl. Acad. Sci. USA, 75, 1927 (1978)] or the lithium method [J. Bacteriol., 153, 163 (1983)] for Saccharomyces cerevisiae. Transformation may also be accomplished, for example, by the leaf disk method [Science, 227, 129 (1985)] or electroporation [Nature, 319, 791 (1986)] for plant cells, by the method of Graham et al. [Virology, 52, 456 (1973)] for animal cells, and by the method of Summers et al. [Mol. Cell. Biol., 3, 2156-2165 (1983)] for insect cells.

(3) G34 Enzyme Protein of the Present Invention

As illustrated in the Example section described later, a polypeptide having a novel enzymatic activity can be isolated and purified, for example, by integrating a nucleic acid having the nucleotide sequence of SEQ ID NO: 1 or 3 into an expression vector and then expressing the nucleic acid.

First, in light of the above point, a typical embodiment of the protein of the present invention is an isolated G34 enzyme protein consisting of the putative amino acid sequence shown in SEQ ID NO: 2 or 4. More specifically, this enzyme protein has the activities shown below.

Catalytic Reaction:

The enzyme protein allows transfer of “N-acetyl-D-galactosamine (GalNAc)” from its donor substrate to an acceptor substrate containing “N-acetyl-D-glucosamine (GlcNAc).” Examination of motif sequences in the amino acid sequence indicates that the linking mode between N-acetylgalactosamine and N-acetylglucosamine is a β1,3 glycosidic linkage (see Example 2).

Donor Substrate Specificity:

The above N-acetyl-D-galactosamine donor substrate encompasses sugar nucleotides having N-acetylgalactosamine, such as uridine diphosphate-N-acetylgalactosamine (UDP-GalNAc), adenosine diphosphate-N-acetylgalactosamine (ADP-GalNAc), guanosine diphosphate-N-acetylgalactosamine (GDP-GalNAc) and cytidine diphosphate-N-acetylgalactosamine (CDP-GalNAc). A typical donor substrate is UDP-GalNAc.

Namely, the G34 enzyme protein of the present invention catalyzes a reaction of the following scheme:

UDP-GalNAc + GlcNAc-R → UDP + GalNAc-β1,3-GlcNAc-R (wherein R represents, e.g., a glycoprotein, glycolipid, oligosaccharide or polysaccharide having the GlcNAc residue).

Acceptor Substrate Specificity:

An acceptor substrate of the above GalNAc is N-acetyl-D-glucosamine, typically an N-acetyl-D-glucosamine residue of glycoproteins, glycolipids, oligosaccharides or polysaccharides, etc.

When using an oligosaccharide as an acceptor substrate, the human G34 protein obtained in Example 1 described later (typically having a region covering amino acid 36 to the C-terminal end of SEQ ID NO: 2) shows transferase activity toward Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core2 (core2 = Gal-β1-3-(GlcNAc-β1-6)-GalNAc-α-pNp; the same applying hereinafter), pNp-core3 (core3 = GlcNAc-β1-3-GalNAc-α-pNp; the same applying hereinafter) and pNp-core6 (core6 = GlcNAc-β1-6-GalNAc-α-pNp; the same applying hereinafter). Preferably, the human G34 protein is free from transferase activity toward Bz-α-GlcNAc and Gal-β1-3-GlcNAc-β-pNp. Moreover, when the activity is compared between these substrates, the transferase activity is very high toward pNp-core2 and Bz-β-GlcNAc, and is highest toward pNp-core2. The transferase activity is relatively low toward GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core3 and pNp-core6.

Likewise, the mouse G34 protein obtained in Example 4 described later (typically having an active region covering amino acid 35 to the C-terminal end of SEQ ID NO: 4) shows transferase activity toward Bz-β-GlcNAc, pNp-β-Glc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core2, pNp-core3 and pNp-core6. When the activity is compared between these substrates, the transferase activity is highest toward Bz-β-GlcNAc, followed by pNp-core2, pNp-core6, pNp-core3, pNp-β-Glc and GlcNAc-β1-4-GlcNAc-β-Bz in the order named.

As used herein, “GlcNAc” represents an N-acetyl-D-glucosamine residue, “GalNAc” represents an N-acetyl-D-galactosamine residue, “Glc” represents a glucose residue, “Bz” represents a benzyl group, “pNp” represents a p-nitrophenyl group, “oNp” represents an o-nitrophenyl group, and “-” represents a glycosidic linkage. Numbers in these formulae each represent the carbon number in the sugar ring where the above glycosidic linkage is present. Likewise, “α” and “β” represent anomers of the above glycosidic linkage at the 1-position of the sugar ring. An anomer whose positional relationship with CH₂OH or CH₃ at the 5-position is trans is represented by “α”, and one whose relationship is cis is represented by “β”.

Optimum Buffer and Optimum pH (Table 3 and FIG. 4):

Examination of the human G34 protein indicates that the protein has the above catalytic effect in each of the following optimum buffers: MES (2-morpholinoethanesulfonic acid) buffer, sodium cacodylate buffer or HEPES (N-[2-hydroxyethyl]piperazine-N′-[2-ethanesulfonic acid]) buffer.

The pH dependence of the activity in each buffer is as follows: in MES buffer, the activity is highest around pH 5.50 to 5.78 and second highest around pH 6.75; in sodium cacodylate buffer, the activity increases as the pH decreases from around 6.2 to around 5.0 and is highest around pH 5.0, while the activity also increases in a pH-dependent manner between around pH 6.2 and 7.0 and nearly plateaus around pH 7.4; and in HEPES buffer, the activity is highest around pH 7.4 to 7.5. Among them, HEPES buffer at a pH of about 7.4 to about 7.5 results in the strongest activity. In all the buffers, the activity is lower in a pH range of 6.2 to 6.6 than in other pH ranges.

Divalent Ion Requirement (Table 4 and FIG. 5):

The activity of the human G34 protein is enhanced in the presence of a divalent metal ion, particularly Mn²⁺, Co²⁺ or Mg²⁺. The influence of each metal ion concentration on the activity is as follows: in the case of Mn²⁺ and Co²⁺, the activity increases in a concentration-dependent manner up to around 5.0 nM and then nearly plateaus at higher concentrations, while in the case of Mg²⁺, the activity increases in a concentration-dependent manner up to around 2.5 nM and then nearly plateaus at higher concentrations. However, the Mn²⁺-induced enhancement of the activity is completely eliminated in the presence of Cu²⁺.

As described above, the G34 enzyme protein of the present invention can transfer a GalNAc residue to a GlcNAc residue with β1-3 glycosidic linkage under the enzymatic reaction conditions mentioned above and is useful for sugar chain synthesis or modification reactions targeted at glycoproteins, glycolipids, oligosaccharides or polysaccharides, etc.

Secondly, having disclosed herein the amino acid sequences shown in SEQ ID NOs: 2 and 4 which are given as typical examples of the primary structure of the above enzyme protein, the present invention provides all proteins which can be produced on the basis of these amino acid sequences through genetic engineering procedures well known in the art (hereinafter also referred to as “mutated proteins” or “modified proteins”). Namely, according to common knowledge in the art, the enzyme protein of the present invention is not limited only to a protein consisting of the amino acid sequence of SEQ ID NO: 2 or 4 estimated from the nucleotide sequence of each cloned nucleic acid, and is also intended to include, for example, a protein consisting of a non-full-length polypeptide having, e.g., a partial N-terminal deletion of the amino acid sequence, or a protein homologous to such an amino acid sequence, each of which has properties inherent to the protein, as illustrated below.

First, the human G34 enzyme protein of the present invention may preferably have an amino acid sequence covering amino acid 189 to the C-terminal end of SEQ ID NO: 2, more preferably an amino acid sequence covering amino acid 36 to the C-terminal end as obtained in the Example section described later. Likewise, the mouse G34 enzyme protein of the present invention may preferably have an amino acid sequence covering amino acid 35 to the C-terminal end of SEQ ID NO: 4.
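The N-terminally truncated regions described above are defined by 1-based residue numbers. As a minimal sketch, assuming the protein is held as a one-letter amino acid string, such a region may be extracted as follows; the function name is hypothetical, and no sequence from SEQ ID NO: 2 or 4 is reproduced here.

```python
def c_terminal_region(protein: str, start_residue: int) -> str:
    """Return the region from 1-based residue start_residue to the C-terminal end.

    Hypothetical helper for illustration: residue 1 is the N-terminal residue,
    so start_residue=36 drops the first 35 residues.
    """
    if not 1 <= start_residue <= len(protein):
        raise ValueError("start_residue is outside the sequence")
    # Python strings are 0-based, hence the offset of one.
    return protein[start_residue - 1:]
```

For example, calling the function with a start residue of 36 on the full-length human sequence would give the region covering amino acid 36 to the C-terminal end.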

Moreover, it is well known that proteins having physiological activities, such as enzymes, usually maintain those activities even when their amino acid sequences have substitution, deletion, insertion or addition of one or more amino acids. It is also known that among naturally-occurring proteins, there are mutated proteins which have gene mutations resulting from differences in the species of source organisms and/or differences in ecotype or which have one or more amino acid mutations resulting from the presence of closely resembling isozymes, etc. In light of this point, the protein of the present invention also encompasses mutated proteins which have an amino acid sequence with substitution, deletion, insertion or addition of one or more amino acids in each amino acid sequence shown in SEQ ID NO: 2 or 4 and which have the ability to transfer a GalNAc residue to a GlcNAc residue with β1-3 glycosidic linkage under the enzymatic reaction conditions mentioned above. Moreover, particularly preferred are modified proteins having amino acid sequences with substitution, deletion, insertion or addition of one or several amino acids in each amino acid sequence shown in SEQ ID NO: 2 or 4.

The expression “one or more amino acids” found above means preferably 1 to 200 amino acids, more preferably 1 to 100 amino acids, even more preferably 1 to 50 amino acids, and most preferably 1 to 20 amino acids. In general, in a case where amino acid substitution occurs as a result of site-specific mutagenesis, the number of amino acids which can be substituted while maintaining the activities inherent to the original protein is preferably 1 to 10.

The modified protein of the present invention also includes those obtained by substitution between functionally equivalent amino acids. Namely, it is generally well known to those skilled in the art that recombinant proteins having a desired mutation(s) can be prepared by procedures involving introduction of substitution between functionally equivalent amino acids (e.g., replacement of one hydrophobic amino acid with another hydrophobic amino acid, replacement of one hydrophilic amino acid with another hydrophilic amino acid, replacement of one acidic amino acid with another acidic amino acid, or replacement of one basic amino acid with another basic amino acid). The modified proteins thus obtained often have the same properties as the original protein. In light of this point, modified proteins having such amino acid substitutions also fall within the scope of the present invention.
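The notion of substitution between functionally equivalent amino acids can be sketched in Python as follows. The grouping below is one simplified classification chosen for illustration and is an assumption; the text above does not prescribe any particular grouping, and other classifications are in common use. In particular, acidic and basic residues are also hydrophilic but are kept in separate groups here.

```python
# Simplified, assumed grouping of the 20 standard amino acids (one-letter codes).
GROUPS = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "hydrophobic": set("AVLIMFWP"),
    "polar_uncharged": set("STNQYCG"),
}


def group_of(residue: str) -> str:
    """Return the name of the assumed group that contains the residue."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues fall in the same assumed group."""
    return group_of(original) == group_of(replacement)
```

Under this grouping, replacing aspartic acid with glutamic acid counts as a substitution between equivalent residues, whereas replacing lysine with aspartic acid does not. Whether a given modified protein retains activity still has to be confirmed experimentally.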

Moreover, the modified protein of the present invention may be a glycoprotein having sugar chains attached to the polypeptide as long as it has such an amino acid sequence as defined above and has an enzymatic activity inherent to the intended enzyme.

To identify the range of the homologous protein of the present invention, an identity search using GENETYX software (Genetyx Corporation, Japan) was performed for the amino acid sequence shown in SEQ ID NO: 2 or 4 of the present invention. The search indicates that the amino acid sequence shares 14% identity with the known β1,4GalNAc transferase showing the highest homology (Non-patent Document 1 listed above) and shares 30% identity with the known β1,3Gal transferase showing the highest homology (Non-patent Document 2 listed above). In light of these points, an amino acid sequence for the homologous protein of the present invention preferably shares more than 30% identity, more preferably at least 40% identity, and particularly preferably at least 50% identity with the amino acid sequence shown in SEQ ID NO: 2 or 4.

Likewise, the amino acid sequences shown in SEQ ID NOs: 2 and 4 share 88% identity with each other. In light of this point, a preferred amino acid sequence for the homologous protein of the present invention can be defined as sharing at least 88% identity, more preferably at least 90% identity, with the amino acid sequence of SEQ ID NO: 2.

The above GENETYX is genetic information processing software for nucleic acid/protein analysis and enables standard analyses of homology and multiple alignment, as well as signal peptide prediction, promoter site prediction and secondary structure prediction. The homology analysis program used herein employs the Lipman-Pearson method (Lipman, D. J. & Pearson, W. R., Science, 227, 1435-1441 (1985)) frequently used as a rapid and sensitive method. In the present invention, the percentage of identity may be determined by comparing sequence information using, e.g., the BLAST program described by Altschul et al. (Nucl. Acids Res., 25, 3389-3402 (1997)) or the FASTA program described by Pearson et al. (Proc. Natl. Acad. Sci. USA, 85, 2444-2448 (1988)). These programs are available on the Internet at the web site of the National Center for Biotechnology Information (NCBI) or the DNA Data Bank of Japan (DDBJ). The details of various conditions (parameters) for each identity search using each program are shown on these web sites, and default values are commonly used for these searches although part of the settings may be changed as appropriate. It is also possible to use other sequence comparison programs used by those skilled in the art.
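To make the notion of percentage identity concrete, the following Python sketch computes it from two sequences that have already been aligned (with "-" marking gaps). It is not the Lipman-Pearson, BLAST or FASTA algorithm; it only counts identical positions in a given alignment. The choice of denominator (aligned positions without gaps) is an assumption, and the programs named above may report identity on a different basis.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of identical residues over gap-free aligned positions.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length; '-' denotes a gap. Hypothetical helper for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # positions with a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free aligned positions to compare")
    return 100.0 * matches / compared
```

A threshold such as "at least 40% identity" can then be tested by comparing the returned value with 40.0.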

Thirdly, the isolated protein of the present invention may be administered as an immunogen to an animal to produce an antibody against the protein, as described later. Such an antibody may be used for immunoassays to measure and quantify the enzyme. Thus, the present invention is also useful in preparing such an immunogen. In light of this point, the protein of the present invention also includes a polypeptide fragment, mutant or fusion protein thereof, which contains an antigenic determinant or epitope for eliciting antibody formation.

(4) Isolation and Purification of the G34 Enzyme Protein of the Present Invention

The enzyme protein of the present invention may be isolated and purified in the following manner.

Recent studies have established genetic engineering procedures which involve culturing and growing a transformant and isolating and purifying a substance of interest from the resulting culture or grown transformant. The enzyme protein of the present invention may also be expressed (produced), e.g., by culturing in a nutrient medium a transformant containing an expression vector carrying the nucleic acid of the present invention.

A nutrient medium used for transformant culturing preferably contains a carbon source, an inorganic nitrogen source or an organic nitrogen source required for host cell (transformant) growth. Examples of a carbon source include glucose, dextran, soluble starch, sucrose and methanol. Examples of an inorganic or organic nitrogen source include ammonium salts, nitrate salts, amino acids, corn steep liquor, peptone, casein, meat extracts, soybean meal and potato extracts. If desired, the medium may contain other nutrients such as inorganic salts (e.g., sodium chloride, calcium chloride, sodium dihydrogen phosphate, magnesium chloride), vitamins, and antibiotics (e.g., tetracycline, neomycin, ampicillin, kanamycin). Culturing may be accomplished in a manner known in the art. Culture conditions such as temperature, medium pH and culture period may be appropriately selected such that the protein according to the present invention is produced in a large quantity.

The enzyme protein of the present invention may be obtained from the above culture or grown transformant as follows. Namely, in a case where a protein of interest is accumulated in host cells, the host cells may be collected by manipulations such as centrifugation or filtration, suspended in an appropriate buffer (e.g., Tris buffer, phosphate buffer, HEPES buffer or MES buffer at a concentration around 10 to 100 mM, the pH of which will vary from buffer to buffer, but desirably falls within the range of 5.0 to 9.0), and then crushed in a manner suitable for the host cells used, followed by centrifugation to obtain the contents of the host cells. On the other hand, in a case where a protein of interest is secreted from host cells, the host cells and the medium are separated from each other by manipulations such as centrifugation or filtration to obtain a culture filtrate. The crushed host cell solution or culture filtrate may be provided directly or may be treated by ammonium sulfate precipitation and dialysis before being provided for isolation and purification of the protein.

Isolation and purification of a protein of interest may be accomplished in the following manner. Namely, in a case where the protein is labeled with a tag such as 6× histidine, GST or maltose-binding protein, the isolation and purification may be accomplished by affinity chromatography suitable for each of the commonly used tags. On the other hand, in a case where the protein according to the present invention is produced without being labeled with such a tag, the isolation and purification may be accomplished, e.g., by ion exchange chromatography, which may further be combined with gel filtration, hydrophobic chromatography, isoelectric chromatography, etc.

Moreover, an expression vector may be constructed to facilitate isolation and purification. In particular, the isolation and purification is facilitated if an expression vector is constructed to express a fusion protein of a polypeptide having an enzymatic activity with an identification peptide and the enzyme protein is prepared in a genetic engineering manner. An example of the above identification peptide is a peptide which, when the enzyme according to the present invention is prepared by gene recombination techniques, is attached to the polypeptide having an enzymatic activity so that the enzyme is expressed as a fusion protein, thereby facilitating secretion, separation, purification or detection of the enzyme from the grown transformant.

Examples of such an identification peptide include peptides such as a signal peptide (a peptide composed of 15 to 30 amino acid residues, which is present at the N-terminal end of many proteins and is functional in cells for protein selection in the intracellular membrane permeation mechanism; e.g., OmpA, OmpT, Dsb), protein kinase A, Protein A (a protein with a molecular weight of about 42,000, which is a component constituting the Staphylococcus aureus cell wall), glutathione S transferase, His tag (a sequence consisting of 6 to 10 histidine residues in series), myc tag (a 13 amino acid sequence derived from cMyc protein), FLAG peptide (an analysis marker composed of 8 amino acid residues), T7 tag (composed of the first 11 amino acid residues of the gene 10 protein), S tag (composed of pancreas RNase A-derived 15 amino acid residues), HSV tag, pelB (a 22 amino acid sequence from the E. coli external membrane protein pelB), HA tag (composed of hemagglutinin-derived 10 amino acid residues), Trx tag (thioredoxin sequence), CBP tag (calmodulin-binding peptide), CBD tag (cellulose-binding domain), CBR tag (collagen-binding domain), β-lac/blu (β-lactamase), β-gal (β-galactosidase), luc (luciferase), HP-Thio (His-patch thioredoxin), HSP (heat shock peptide), Lny (laminin γ-peptide), Fn (fibronectin partial peptide), GFP (green fluorescent peptide), YFP (yellow fluorescent peptide), CFP (cyan fluorescent peptide), BFP (blue fluorescent peptide), DsRed, DsRed2 (red fluorescent peptides), MBP (maltose-binding peptide), LacZ (lactose operator), IgG (immunoglobulin G), avidin and Protein G, any of which can be used.

Among them, particularly preferred are the signal peptide, protein kinase A, Protein A, glutathione S transferase, His tag, myc tag, FLAG peptide, T7 tag, S tag, HSV tag, pelB and HA tag because they facilitate expression and purification of the enzyme according to the present invention through genetic engineering procedures. In particular, it is preferable to obtain the enzyme as a fusion protein with FLAG peptide (Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys) because it is very easy to handle. The above FLAG peptide is extremely antigenic and provides an epitope capable of reversible binding of a specific monoclonal antibody, thus enabling rapid assay and easy purification of the expressed recombinant protein. A mouse hybridoma called 4E11 produces a monoclonal antibody which binds to FLAG peptide in the presence of a certain divalent metal cation, as described in U.S. Pat. No. 5,011,912 (incorporated herein by reference). The 4E11 hybridoma cell line has been deposited under Accession No. HB 9259 with the American Type Culture Collection. The monoclonal antibody binding to FLAG peptide is available from Eastman Kodak Co., Scientific Imaging Systems Division, New Haven, Conn.

pFLAG-CMV-1 (SIGMA) can be presented as an example of a basic vector which can be expressed in mammalian cells and enables obtaining the enzyme protein of the present invention as a fusion protein with the above FLAG peptide. Likewise, examples of a vector which can be expressed in insect cells include, but are not limited to, pFBIF (i.e., a vector prepared by integrating the region encoding FLAG peptide into pFastBac (Invitrogen Corporation); see the Example section described later). Those skilled in the art will be able to select an appropriate basic vector depending on, e.g., the host cell, restriction enzyme and identification peptide to be used for expression of the enzyme.

(5) Antibody Recognizing the G34 Enzyme Protein of the Present Invention

The present invention provides an antibody which is immunoreactive to the G34 enzyme protein. Such an antibody is capable of specifically binding to the enzyme protein via the antigen-binding site of the antibody (as opposed to non-specific binding). More specifically, a protein having the amino acid sequence of SEQ ID NO: 2 or 4 or a fragment, mutant or fusion protein thereof may be used as an immunogen for producing an antibody immunoreactive to each of them.

More specifically, such a protein, fragment, mutant or fusion protein contains an antigenic determinant or epitope for eliciting antibody formation. The antigenic determinant or epitope may be either linear or conformational (discontinuous) and can be identified by any technique known in the art. Thus, the present invention also relates to an antigenic epitope of the G34 enzyme protein. Such an epitope is useful in preparing an antibody, particularly a monoclonal antibody, as described in more detail below.

The epitope of the present invention can be used in assays and as a research reagent for purifying a specific binding antibody from materials such as polyclonal sera or supernatants from cultured hybridomas. Such an epitope or a variant thereof may be prepared using techniques known in the art (e.g., solid phase synthesis, chemical or enzymatic cleavage of a protein) or using recombinant DNA technology.

The enzyme protein of the present invention may be used to derive any embodiment of an antibody. If the entire polypeptide, a partial polypeptide or an epitope of the protein has been isolated, both polyclonal and monoclonal antibodies can be prepared using conventional techniques. See, e.g., Kennett et al. (eds.), Monoclonal Antibodies, Hybridomas: A New Dimension in Biological Analyses, Plenum Press, New York, 1980.

The present invention also provides a hybridoma cell line producing a monoclonal antibody specific to the G34 enzyme protein. Such a hybridoma can be produced and identified by conventional techniques. One method for producing such a hybridoma cell line involves immunizing an animal with the enzyme protein of the present invention, collecting spleen cells from the immunized animal, fusing the spleen cells with a myeloma cell line to give hybridoma cells, and identifying a hybridoma cell line which produces a monoclonal antibody binding to the enzyme. The resulting monoclonal antibody may be collected by conventional techniques.

The monoclonal antibody of the present invention encompasses chimeric antibodies, for example, humanized mouse monoclonal antibodies. Such a humanized antibody is advantageous in reducing immunogenicity when administered to a human subject.

The present invention also provides an antigen-binding fragment of the above antibody. Examples of an antigen-binding fragment which can be produced by conventional techniques include, but are not limited to, Fab and F(ab′)₂ fragments. The present invention also provides an antibody fragment and derivative which can be produced by genetic engineering techniques.

The antibody of the present invention can be used in assays to detect the presence of the G34 enzyme protein of the present invention or a polypeptide fragment thereof, either in vitro or in vivo. The antibody of the present invention may also be used in purifying the G34 enzyme protein or a polypeptide fragment thereof by immunoaffinity chromatography.

Moreover, the antibody of the present invention may also be provided as a blocking antibody capable of blocking the binding of the above glycosyltransferase protein to its binding partner (e.g., acceptor substrate), thus inhibiting the enzyme's biological activity resulting from such binding. Such a blocking antibody may be identified using any suitable assay procedure, for example, by testing the antibody for the ability to inhibit the binding of the protein to certain cells expressing an acceptor substrate.

Alternatively, the blocking antibody may also be identified in assays for the ability to inhibit a biological effect resulting from the enzyme protein bound to its binding partner in target cells. Such an antibody may be used in an in vitro procedure or administered in vivo to inhibit a biological activity mediated by the entity that generated the antibody. Thus, the present invention also provides an antibody for treating disorders which are caused or exacerbated by either direct or indirect interaction between the G34 enzyme protein and its binding partner. Such therapy will involve in vivo administration of the blocking antibody to a mammal in an amount effective for inhibiting a binding partner-mediated biological activity. For use in such therapy, monoclonal antibodies are preferred and, in one embodiment, an antigen-binding antibody fragment is used.

(6) Nucleic Acid of the Present Invention for Canceration Assay

In response to the discovery of the above G34 enzyme protein, the inventors of the present invention have confirmed that mRNA encoding this protein is widely found in cancerous tissues and cell lines and that the expression level of the mRNA is significantly increased particularly in cancerous tissues. Thus, the G34 nucleic acid is useful as a tumor marker for, e.g., cancer diagnosis targeted at biological samples containing transcription products. In this aspect, the present invention provides a nucleic acid for measurement, which is capable of hybridizing under stringent conditions to a nucleic acid defined by the nucleotide sequence shown in SEQ ID NO: 1 or 3.

In one embodiment, the nucleic acid for measurement of the present invention is a primer or probe targeting the G34 nucleic acid in a biological sample and having a nucleotide sequence selected from the nucleotide sequence of SEQ ID NO: 1 or 3. In particular, since the nucleotide sequence of SEQ ID NO: 1 is derived from mRNA encoding a structural gene and contains the entire open reading frame (ORF) of the G34 gene, full-length or nearly full-length sequences of SEQ ID NO: 1 or 3 are usually found in transcription products from a biological sample. In light of this point, the primer or probe according to the present invention has a desired partial sequence selected from each nucleotide sequence of SEQ ID NO: 1 or 3 (either homologous or complementary to the selected sequence depending on the intended use) and hence can be provided as a nucleic acid capable of specifically hybridizing to the target sequence.

Typical examples of such a primer or probe include a native DNA fragment derived from a nucleic acid having at least a part of the nucleotide sequence shown in SEQ ID NO: 1 or 3, a DNA fragment synthesized to have at least a part of the nucleotide sequence shown in SEQ ID NO: 1 or 3, or complementary strands of these fragments.

Such a primer or probe as mentioned above may be used to detect and/or quantify the target nucleic acid in a biological sample, as described later. Since sequences on the genome can also be targeted, the nucleic acid of the present invention may also be used as an antisense primer for medical research or gene therapy.

(A) Probe of the Present Invention

In a preferred embodiment, the nucleic acid for measurement of the present invention is a probe targeting a nucleic acid having the nucleotide sequence of SEQ ID NO: 1 or 3 or a complementary strand of at least one of them. The probe contains an oligonucleotide composed of at least a dozen nucleotides, preferably at least 15 nucleotides, more preferably at least 17 nucleotides, and even more preferably at least 20 nucleotides selected from the nucleotide sequences of SEQ ID NOs: 1 and 3, or a complementary strand of the oligonucleotide, or full-length cDNA of its ORF region or a complementary strand of the cDNA.

In a case where the nucleic acid for measurement of the present invention is provided as an oligonucleotide probe, it is understood that a length of a dozen or so nucleotides (e.g., 15 nucleotides, preferably 17 nucleotides) may be sufficient for the nucleic acid to specifically hybridize under stringent conditions to its target nucleic acid. Namely, those skilled in the art will be able to select an appropriate partial sequence composed of at least 15 to 20 nucleotides from the nucleotide sequence of SEQ ID NO: 1 or 3 in accordance with various known strategies for oligonucleotide probe design. In this case, the amino acid sequence information shown in SEQ ID NO: 2 or 4 is helpful in selecting a unique sequence that may be suitable as a probe.
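As a minimal sketch of how partial sequences of a fixed length might be enumerated from a target sequence, the following Python code lists every window of a chosen length and returns its complementary strand as a candidate probe. The function names are hypothetical, the default length of 17 is taken from the preferred length mentioned above, and the code assumes the target contains only the letters A, C, G and T. It performs no specificity, melting temperature or secondary structure checks, which an actual probe design strategy would require.

```python
# Translation table for the four DNA bases (assumes no ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(target: str, length: int = 17) -> list[tuple[int, str]]:
    """Return (1-based start position, complementary probe) for every window.

    Each probe is the reverse complement of a length-nucleotide window of
    the target, so it would hybridize to that window of the target strand.
    """
    seq = target.upper()
    if length <= 0:
        raise ValueError("length must be positive")
    return [
        (start + 1, reverse_complement(seq[start:start + length]))
        for start in range(len(seq) - length + 1)
    ]
```

The resulting list can then be filtered by whatever uniqueness or hybridization criteria the designer applies.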

Likewise, in the case of a cDNA probe, for example, a probe with a high molecular weight is generally difficult to handle when used as a reagent or diagnostic agent for medical research. In light of this point, the probe of the present invention intended for medical research includes a nucleic acid composed of 50 to 500 nucleotides, more preferably 60 to 300 nucleotides selected from each nucleotide sequence of SEQ ID NO: 1 or 3.

The term “stringent conditions” found above means conditions of moderate or high stringency as explained earlier. Those skilled in the art will be able to readily determine and achieve conditions of moderate or high stringency suitable for the selected probe, on the basis of common knowledge and empirical rules about known procedures for various probe designs and hybridization conditions.

Although depending on, e.g., the nucleotide length to be selected and the hybridization conditions to be applied, a relatively short oligonucleotide probe can serve as a probe even when it has a mismatch of one or several nucleotides, particularly one or two nucleotides, in comparison with the nucleotide sequence of SEQ ID NO: 1 or 3. Likewise, a relatively long cDNA probe can also serve as a probe even when it has a mismatch of 50% or less, preferably 20% or less, in comparison with the nucleotide sequence of SEQ ID NO: 1 or a nucleotide sequence complementary thereto.

The probe of the present invention thus designed can be used as a labeled probe having a label such as a fluorescent label, a radioactive label or a biotin label, in order to detect or confirm a hybrid formed with a target sequence in G34.

For example, the labeled probe of the present invention may be used for confirmation or quantification of PCR amplification products from the G34 nucleic acid. In this case, it is preferable to use a probe targeting the nucleotide sequence located in a region between a pair of primer sequences used for PCR. An example of such a probe may be an oligonucleotide consisting of the nucleotide sequence shown in SEQ ID NO: 16 (corresponding to a complementary strand against nucleotides 525 to 556 in SEQ ID NO: 1) (see Example 3).
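As an illustration of the probe placement described above, the following is a minimal sketch, not the method used in the Examples. It derives the complementary strand of a 1-based, inclusive region of a template and checks that the probe region lies between the two primer regions. The function names and the toy template are hypothetical; the sequence of SEQ ID NO: 1 is not reproduced here.

```python
# Watson-Crick complements for unambiguous DNA bases (assumption: no IUPAC codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_probe(template: str, start: int, end: int) -> str:
    """Complementary strand against nucleotides start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(template)):
        raise ValueError("coordinates fall outside the template")
    return reverse_complement(template[start - 1:end])


def lies_between(probe, forward, reverse) -> bool:
    """True if the probe region sits strictly between the two primer regions.

    Each argument is a (start, end) pair in template coordinates.
    """
    return forward[1] < probe[0] and probe[1] < reverse[0]


# Toy template (hypothetical), used only to show the coordinate convention.
toy_template = "ATGCCGTA"
toy_probe = antisense_probe(toy_template, 2, 5)  # region "TGCC"

# Coordinates quoted in the text for the probe and the human primer pair.
probe_ok = lies_between((525, 556), (481, 501), (562, 581))
```

With the coordinates given in the text, the probe region (525 to 556) falls between the primer regions (481 to 501 and 562 to 581), so `probe_ok` is true.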

The probe of the present invention may be included in a kit such as a diagnostic DNA probe kit or may be immobilized on a chip such as a DNA microarray chip.

(B) Primers of the Present Invention

In a preferred embodiment, the primers obtained from the nucleic acid for the canceration assay of the present invention are oligonucleotide primers. To prepare oligonucleotide primers, two regions may be selected from the ORF region of the nucleotide sequence shown in SEQ ID NO: 1 or 3 in such a manner as to satisfy the following conditions:

a) the length of each region is at least a dozen nucleotides, particularly at least 15 nucleotides, preferably at least 17 nucleotides, more preferably at least 20 nucleotides, and at most 50 nucleotides; and

b) the G+C content in each region is 40% to 70%.
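Conditions a) and b) can be checked mechanically for any candidate region. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the default lower bound of 15 nucleotides is one of the several thresholds listed in a), and the candidate sequences are invented for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (case-insensitive)."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def satisfies_primer_conditions(
    seq: str,
    min_len: int = 15,    # assumption: the "at least 15 nucleotides" tier of a)
    max_len: int = 50,    # upper bound from a)
    gc_min: float = 0.40, # lower G+C bound from b)
    gc_max: float = 0.70, # upper G+C bound from b)
) -> bool:
    """Return True if seq meets the length and G+C conditions."""
    if not seq:
        return False
    if not (min_len <= len(seq) <= max_len):
        return False
    return gc_min <= gc_fraction(seq) <= gc_max


# Invented candidates, for illustration only.
balanced = "ATGCATGCATGCATGCATGC"  # 20 nt, half G+C
at_rich = "ATATATATATATATATATAT"   # 20 nt, no G+C
too_short = "GCGCATAT"             # 8 nt
```

A stricter screen can be obtained by raising `min_len` to 17 or 20, matching the more preferred lengths in a).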

In actual fact, oligonucleotide primers may be prepared as single-stranded DNAs having nucleotide sequences identical or complementary to the two regions thus selected, or may be prepared as single-stranded DNAs modified not to lose the binding specificity to these nucleotide sequences. Although each primer of the present invention preferably has a sequence that is completely complementary to the selected target sequence, a mismatch of one or two nucleotides may be permitted.

Examples of the pair of primers according to the present invention include a pair of oligonucleotides consisting of SEQ ID NOs: 14 and 15 (corresponding to complementary strands against nucleotides 481-501 and 562-581 in SEQ ID NO: 1, respectively) for human G34, and a pair of oligonucleotides consisting of SEQ ID NOs: 17 and 18 (corresponding to complementary strands against nucleotides 481-501 and 562-581 in SEQ ID NO: 3, respectively) for mouse G34.

(7) Canceration Assay According to the Present Invention

As described earlier, the G34 nucleic acid of the present invention was confirmed to show a significant increase in the expression level (i.e., transcription level of the gene from the genome into mRNA) in a cancerous biological sample when compared to a normal biological sample. The G34 nucleic acid of the present invention was demonstrated to be useful at least in a canceration assay for large intestine (colon) cancer or lung cancer (see Example 3).

According to detailed embodiments of the canceration assay of the present invention, transcription products extracted from a biological sample or a nucleic acid library derived therefrom may be used as a test sample and measured for the amount of the G34 nucleic acid (typically the amount of its mRNA) using the above probe or primer to determine whether the measured value is significantly higher than that of a normal biological sample. In this case, if the measured value of the test biological sample is significantly higher than the reference value of the normal biological sample, the test biological sample is determined as being cancerous or having a high grade of malignancy.

In the canceration assay of the present invention, the reference value for a normal biological sample used as a control may be a value measured for a control site (typically a normal site) in the same tissue of the same patient or may be a value normalized from known data obtained in a control site, e.g., the mean value of mRNA levels in normal tissues.

According to the measurement of expression levels using the nucleic acid for measurement of the present invention, human G34 is found to be expressed at a high level in the brain, skeletal muscle, pancreas, adrenal gland, testis and prostate when measured in normal sites, and there is also significant expression in other sites, although at a relatively low level. This indicates that human G34 expression is widely found over various tissues and that, upon canceration, the expression level of human G34 is significantly increased even in tissues with a relatively low expression level, such as large intestine (colon) and lung tissues. Once these data have been provided, those skilled in the art will recognize the actual utility and effect of the nucleic acid for measurement of the present invention.

In this assay, whether the measured value for a test sample is significantly higher than that of a normal sample may be determined by the criteria that are set depending on the accuracy (positive rate) required for the assay or the grade of malignancy to be determined. The criteria may be freely set depending on the intended purpose; for example, the reference value to be determined as positive may be set to a higher value for the purpose of detecting tissues with a high grade of malignancy or may be set to a lower value for the purpose of comprehensively detecting test samples with signs or risk of canceration.
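The comparison against a reference value can be sketched as follows. This is an illustrative decision rule only: the text leaves the criterion open, so the fold-change cutoff, the function names and the numerical values below are all assumptions made for the example.

```python
from statistics import mean


def reference_value(normal_levels):
    """Reference value taken as the mean level measured in normal samples."""
    return mean(normal_levels)


def is_positive(test_level: float, reference: float, fold_cutoff: float) -> bool:
    """Call a sample positive when test/reference reaches the chosen cutoff.

    fold_cutoff is a hypothetical criterion: a higher cutoff flags only
    strongly elevated samples, a lower cutoff flags samples more broadly.
    """
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return test_level / reference >= fold_cutoff


# Invented measurements in arbitrary units.
normal_levels = [2.0, 4.0, 6.0]
ref = reference_value(normal_levels)           # mean of the normal samples

broad_call = is_positive(12.0, ref, fold_cutoff=2.0)   # permissive criterion
strict_call = is_positive(12.0, ref, fold_cutoff=5.0)  # stringent criterion
```

The same measured value can thus be called positive under a permissive criterion and negative under a stringent one, which is the trade-off described in the preceding paragraph.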

Examples will be given below of hybridization and PCR assays to illustrate the canceration assay of the present invention.

(A) Hybridization Assay

Embodiments of this assay include those using a probe obtained from the nucleic acid of the present invention, e.g., methods using various hybridization assays well known to those skilled in the art, exemplified by Southern blotting, Northern blotting, dot blotting or colony hybridization. In the case of requiring amplification and/or quantification of the detected signal, these methods may further be combined with immunoassay.

According to typical hybridization assays, a nucleic acid extracted from a biological sample or an amplification product thereof may be immobilized on a solid phase and hybridized with a labeled probe under stringent conditions. After washing, the label attached to the solid phase may be measured.

Extraction and purification of transcription products from a biological sample may be accomplished by using any method known to those skilled in the art.

(B) PCR Assay

In a preferred embodiment, the canceration assay of the present invention includes PCR methods based on nucleic acid amplification using the primers of the present invention. The details of PCR are as explained earlier. In this subsection, a detailed PCR-based embodiment of this assay will be explained.

G34 mRNA in transcription products to be assayed can be amplified by PCR using a pair of primers located at both ends of a given region selected from the nucleotide sequence of G34. In this step, if even trace amounts of G34 nucleic acid fragments are present in an analyte, these fragments will serve as templates to replicate and amplify the nucleic acid region between the primer pair. After repeating a given number of PCR cycles, the nucleic acid fragments serving as templates are each amplified to a desired concentration. Under the same amplification conditions, the amplification product will be obtained in proportion to the amount of G34 mRNA present in the analyte. Then, the above probe or the like targeting the amplified region may be used to confirm whether the amplification product is the nucleic acid of interest and also quantify the same. Likewise, the nucleic acid in a normal tissue may also be measured in the same manner. In this case, a nucleic acid of a gene that is widely and usually present in the same tissue or the like (e.g., a nucleic acid encoding glyceraldehyde-3-phosphate dehydrogenase (GAPDH) or β-actin) may be used as a control to remove variations among individuals. The measured value for the transcription level of G34 is provided for comparison to assay the presence of canceration or the grade of malignancy, as described above.
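The use of a control gene such as GAPDH or β-actin to remove sample-to-sample variation can be sketched as a simple ratio calculation. This is a minimal illustration assuming that the measured values are already proportional to template amount; the function names and numbers are hypothetical, and the text does not prescribe this particular formula.

```python
def normalized_level(target_amount: float, control_amount: float) -> float:
    """Target signal divided by the control-gene signal from the same sample."""
    if control_amount <= 0:
        raise ValueError("control amount must be positive")
    return target_amount / control_amount


def relative_expression(
    test_target: float,
    test_control: float,
    normal_target: float,
    normal_control: float,
) -> float:
    """Ratio of the normalized target level in the test sample to that in the normal sample."""
    test_norm = normalized_level(test_target, test_control)
    normal_norm = normalized_level(normal_target, normal_control)
    return test_norm / normal_norm


# Invented values in arbitrary units: the test sample carries twice as much
# control signal as the normal sample, which the normalization corrects for.
fold_change = relative_expression(
    test_target=8.0, test_control=2.0,
    normal_target=1.0, normal_control=1.0,
)
```

Here the raw target signal differs eightfold between the samples, while the normalized difference is fourfold.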

A nucleic acid sample provided for PCR methods may be either total mRNA extracted from a biological sample (e.g., a test tissue or cell) or total cDNA reverse transcribed from mRNA. In a case where mRNA is amplified, the NASBA method (3SR method, TMA method) using the primer pair mentioned above may be employed. Since the NASBA method per se is well known and kits for this method are commercially available, the method may be readily accomplished by using the primer pair of the present invention.

To detect or quantify the above amplification product, the reaction solution after amplification may be electrophoresed and the resulting bands may be stained with ethidium bromide or the like, or alternatively, the electrophoresed amplification product may be immobilized onto a solid phase (e.g., a nylon membrane), hybridized with a labeled probe specifically hybridizing to a test nucleic acid (e.g., a probe having the nucleotide sequence of SEQ ID NO: 16) and washed, followed by detection of the label.

Examples of PCR methods preferred for this assay include quantitative PCR, especially kinetic RT-PCR or quantitative real-time PCR. In particular, quantitative real-time RT-PCR targeted at mRNA libraries is preferred in that it allows direct purification of a target to be measured from a biological sample and directly reflects the transcription level. However, the nucleic acid quantification in this assay is not limited to quantitative PCR. Other known quantitative DNA assays (e.g., Northern blotting, dot blotting, DNA microarray) using the above probe may also be applied to the PCR products.

Moreover, when performed using a quencher fluorescent dye and a reporter fluorescent dye, quantitative RT-PCR also enables quantification of a target nucleic acid in an analyte. In particular, it may be readily performed since kits for quantitative RT-PCR are commercially available.

Moreover, a target nucleic acid may also be semi-quantified based on the intensity of the corresponding electrophoretic band.

(C) Assay for Therapeutic Effect on Cancer

Other embodiments of the canceration assay of the present invention include an assay for determining the effect of curing or alleviating cancer. For example, targets of this assay include all treatments such as administration of an anticancer agent and radiation therapy, and targets of these treatments include in vitro cancer cells or cancer tissues derived from cancer patients or experimental animal models for carcinogenesis.

According to this assay, in a case where a biological sample is subjected to a certain treatment, it is possible to know the therapeutic effect of the treatment on cancer by determining whether the transcription level of the G34 nucleic acid in the biological sample is reduced due to the treatment. This assay is not limited to a determination whether the transcription level is reduced, and the result may also be evaluated as effective when an increase in the transcription level is significantly prevented. The transcription level may not only be compared with that of an untreated tissue, but also traced over time after the treatment.

The assay of the present invention for therapeutic effect on cancer includes, for example, a determination whether a candidate substance for an anticancer agent is effective for cancerous tissues, whether resistance is developed to an anticancer agent in cancer patients receiving the agent, or whether a candidate substance for an anticancer agent is effective for diseased tissues or the like in experimental animal models. Test tissues from experimental animal models are not limited to in vitro samples, and also include in vivo or ex vivo samples.

(8) Creation of Genetically Engineered Animal

As described earlier, the inventors of the present invention have identified the presence of mouse G34 and its nucleic acid sequence (SEQ ID NO: 3). The present invention also relates to a means for expression and functional analysis of G34 at the animal level on the basis of various gene conversion techniques using fertilized eggs or ES cells, and typically relates to creating transgenic animals into which the G34 gene is introduced and knockout mice which are deficient in mouse G34, etc.

For example, the creation of knockout mice may be accomplished in accordance with routine techniques in the art (see, e.g., Newest Technique for Gene Targeting, edited by Takeshi Yagi, Yodosha Co., Ltd., Japan; Gene targeting, translated and edited by Tetsuo Noda, Medical Science International, Ltd., Japan). Namely, those skilled in the art will be able to obtain G34 homologous recombinant ES cells in accordance with known gene targeting techniques using sequence information of the mouse G34 nucleic acid disclosed herein, thus creating G34 knockout mice using these cells (see Example 7).

Recently, a method has been developed to prevent gene expression by small interfering RNA (T. R. Brummelkamp et al., Science, 296, 550-553 (2002)); it is also possible to create G34 knockout mice in accordance with such a known method.

The provision of G34 knockout mice will be helpful in elucidating the involvement of the G34 gene in certain vital phenomena, i.e., information on redundancy of the gene, the relationship between deficiency of the gene and phenotype at the animal level (including any type of abnormality affecting motor, mental and sensory functions), as well as functions of the gene during the animal life cycle including development, growth and ageing. More specifically, the knockout mice thus obtained may be used to detect a carrier of sugar chains synthesized by G34 and mG34 and to examine their relationship with physiological functions or diseases, etc. For example, glycoproteins and glycolipids may be extracted from each tissue derived from the knockout mice and compared with those of wild-type mice by techniques such as proteomics (e.g., two-dimensional electrophoresis, two-dimensional thin-layer chromatography, mass spectrometry) to identify a carrier of the synthesized sugar chains. Moreover, the relationship with physiological functions or diseases may be estimated by comparing phenotypes (e.g., fetal formation, growth process, spontaneous behavior) between knockout mice and wild-type mice.

Definitions of Terms

As used herein to describe the transcription level of a nucleic acid, the term “measured value” or “expression level” refers to the amount of the nucleic acid present in transcription products from a fixed amount of a biological sample, i.e., the concentration of the nucleic acid. Moreover, since the assay of the present invention relies on the comparison of such measured values, even when a nucleic acid is amplified, e.g., by PCR for the purpose of quantification or even when signals from a probe label are amplified, these amplified values may also be provided for relative comparison. Thus, the “measured value for a nucleic acid” can also be understood as the amount of the nucleic acid after amplification or the signal level after amplification.

As used herein, the term “target nucleic acid” or “the nucleic acid” encompasses all types of nucleic acids, regardless of in vivo or in vitro, including of course G34 mRNA, as well as those obtained using the mRNA as a template. It should be noted that the term “nucleotide sequence” used herein also includes a complementary sequence thereof, unless otherwise specified.

As used herein, the term “biological sample” refers to an organ, tissue or cell, as well as an experimental animal-derived organ, tissue, cell or the like, and preferably refers to a tissue or cell. Examples of such a tissue include the brain, fetal brain, cerebellum, medulla oblongata, submandibular gland, thyroid gland, trachea, lung, heart, skeletal muscle, esophagus, duodenum, small intestine, large intestine (colon), rectum, colon, liver, fetal liver, pancreas, kidney, adrenal gland, thymus, bone marrow, spleen, testis, prostate, mammary gland, uterus and placenta, with the large intestine (colon) and lung being more preferred.

As used herein, the term “measure”, “measurement” or “assay” encompasses all of detection, amplification, quantification and semi-quantification. In particular, the assay according to the present invention relates to a canceration assay for a biological sample, as described above, and hence can be applied to, e.g., cancer diagnosis and treatment in the medical field. The term “canceration assay” used herein includes an assay as to whether a biological sample has become cancerous, as well as an assay as to whether the grade of malignancy is high. The term “cancer” used herein typically encompasses malignant tumors in general and also includes disease conditions caused by the malignant tumors. Thus, targets of the assay according to the present invention include, but are not necessarily limited to, neuroblastoma, glioma, lung cancer, esophageal cancer, gastric cancer, pancreatic cancer, liver cancer, kidney cancer, duodenal cancer, small intestine cancer, large intestine (colon) cancer, rectal cancer, colon cancer and leukemia, with large intestine (colon) cancer and lung cancer being preferred.

The present invention will now be illustrated in more detail by way of the following examples.

EXAMPLES

Example 1 Cloning and Expression of Human G34 Gene, as Well as Purification of the Expressed Protein

β3 galactosyltransferase 6 (β3GalT6) was used as a query for a BLAST search to thereby find a nucleic acid sequence with homology (SEQ ID NO: 1). The open reading frame (ORF) estimated from the nucleic acid sequence is composed of 1503 bp, i.e., 500 amino acids (SEQ ID NO: 2) when calculated as an amino acid sequence. The product encoded by these nucleic acid and amino acid sequences was designated human G34.
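The relationship between the ORF length and the number of amino acids can be checked with a short calculation. The sketch below assumes that the 1503 bp figure includes the terminal stop codon, which is the reading under which 1503 bp corresponds to 500 residues; the function name is hypothetical.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


human_g34_residues = residues_from_orf(1503)  # 501 codons, one of them a stop
```

Under this assumption, 1503 bp gives 501 codons and therefore 500 amino acids, in agreement with the length stated for SEQ ID NO: 2.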

The amino acid sequence of G34 has a hydrophobic amino acid region characteristic of glycosyltransferases at its N-terminal end and shares a homology of 47% (nucleic acid sequence) and 28% (amino acid sequence) with the above β3GalT6. The amino acid sequence of G34 also retains all of the three motifs conserved in the β3GalT family.

In this example, G34 was not only confirmed for its expression in mammalian cells, but also expressed in insect cells for further examination of its activity.

For activity confirmation, it would be sufficient to express at least an active region covering amino acid 189 to the C-terminal end of SEQ ID NO: 2, which is relatively homologous to β3GalT6. In this example, however, an attempt was made to express an active region covering amino acid 36 to the C-terminal end.

Confirmation of Human G34 Gene Expression in Mammalian Cells

The active region covering amino acid 36 to the C-terminal end of G34 was genetically introduced into a mammalian cell line expression vector pFLAG-CMV3 using a FLAG Protein Expression system (Sigma-Aldrich Corporation).

Since pFLAG-CMV3 has a multicloning site, a gene of interest can be introduced into pFLAG-CMV3 when the gene and pFLAG-CMV3 are treated with restriction enzymes and then subjected to ligation reaction.

Kidney-derived cDNA (Clontech, Marathon-ready cDNA) was used as a template and subjected to PCR using a 5′-primer (G34-CMV-F1; SEQ ID NO: 5) and a 3′-primer (G34-CMV-R1; SEQ ID NO: 6) to obtain a DNA fragment of interest. PCR was performed under conditions of 25 cycles of 98° C. for 10 seconds, 55° C. for 30 seconds, and 72° C. for 2 minutes. The PCR product was then electrophoresed on an agarose gel and isolated in a standard manner after gel excision. This PCR product has restriction enzyme sites HindIII and BamHI at the 5′ and 3′ sides, respectively.

After this DNA fragment and pFLAG-CMV3 were each treated with restriction enzymes HindIII and BamHI, the reaction solutions were mixed together and subjected to ligation reaction, so that the DNA fragment was introduced into pFLAG-CMV3. The reaction solution was purified by ethanol precipitation and then mixed with competent cells (E. coli DH5α). After heat shock treatment (42° C., 30 seconds), the cells were seeded on ampicillin-containing LB agar medium.

On the next day, the resulting colonies were confirmed by direct PCR for the DNA of interest. For more reliable results, after sequencing to confirm the DNA sequence, the vector (pFLAG-CMV3-G34A) was extracted and purified.

Human kidney cell-derived cell line 293T cells (2×10⁶) were suspended in 10 ml antibiotic-free DMEM medium (Invitrogen Corporation) supplemented with 10% fetal bovine serum, seeded in a 10 cm dish and cultured for 16 hours at 37° C. in a CO₂ incubator. pFLAG-CMV3-G34A (20 ng) and Lipofectamine 2000 (30 μl, Invitrogen Corporation) were each mixed with 1.5 ml OPTI-MEM (Invitrogen Corporation) and incubated at room temperature for 5 minutes. These two solutions were further mixed gently and incubated at room temperature for 20 minutes. This mixed solution was added dropwise to the dish, and the cells were cultured for 48 hours at 37° C. in a CO₂ incubator.

The supernatant (10 ml) was mixed with NaN₃ (0.05%), NaCl (150 mM), CaCl₂ (2 mM) and anti-FLAG-M1 resin (100 μl, SIGMA), followed by overnight stirring at 4° C. On the next day, the supernatant was centrifuged (3000 rpm, 5 minutes, 4° C.) to collect a pellet fraction. After addition of 2 mM CaCl₂-TBS (900 μl), centrifugation was repeated (2000 rpm, 5 minutes, 4° C.) and the resulting pellet was suspended in 200 μl of 1 mM CaCl₂-TBS for use as a sample for activity measurement (G34 enzyme solution). A part of this sample was electrophoresed by SDS-PAGE and Western blotted using anti-FLAG M2-peroxidase (SIGMA) to confirm the expression of the G34 protein of interest.

As a result, a band was detected at a position of about 60 kDa, thus confirming the expression of the G34 protein.

Insertion of Human G34 Gene into Insect Cell Expression Vector

The active region covering amino acid 36 to the C-terminal end of G34 was integrated into pFastBac (Invitrogen Corporation) in a GATEWAY system (Invitrogen Corporation). Moreover, a Bac-to-Bac system (Invitrogen Corporation) was also used to construct a bacmid.

(1) Creation of Entry Clone

Kidney-derived cDNA (Clontech, Marathon-ready cDNA) was used as a template and subjected to PCR using a 5′-primer (G34-GW-F1; SEQ ID NO: 7) and a 3′-primer (G34-GW-R1; SEQ ID NO: 8) to obtain a DNA fragment of interest. PCR was performed under conditions of 25 cycles of 98° C. for 10 seconds, 55° C. for 30 seconds, and 72° C. for 2 minutes. The PCR product was then electrophoresed on an agarose gel and isolated in a standard manner after gel excision.

This product was integrated into pDONR201 (Invitrogen Corporation) through BP clonase reaction to create an “entry clone.” The reaction was accomplished by incubating the DNA fragment of interest (5 μl), pDONR201 (1 μl, 150 ng), reaction buffer (2 μl) and BP clonase mix (2 μl) at 25° C. for 1 hour. The reaction was stopped by addition of proteinase K (1 μl) and incubation at 37° C. for 10 minutes. The above reaction solution (1 μl) was then mixed with 100 μl competent cells (E. coli DH5α, TOYOBO). After heat shock treatment, the cells were seeded on a kanamycin-containing LB plate.

On the next day, colonies were collected and confirmed by direct PCR for the DNA of interest. For more reliable results, after sequencing to confirm the DNA sequence, the vector (pDONR-G34A) was extracted and purified.

(2) Creation of Expression Clone

At both sides of the insertion site, the above entry clone has attL recombination sites for excision of lambda phage from E. coli. When the entry clone is mixed with LR clonase (a mixture of lambda phage recombination enzymes Int, IHF and Xis) and a destination vector, the insertion site is transferred to the destination vector to give an expression clone. Detailed steps are as shown below.

First, the entry clone (1 μl), pFBIF (0.5 μl, 75 ng), LR reaction buffer (2 μl), TE (4.5 μl) and LR clonase mix (2 μl) were reacted at 25° C. for 1 hour. The reaction was stopped by addition of proteinase K (1 μl) and incubation at 37° C. for 10 minutes (this recombination reaction results in pFBIF-G34A). pFBIF is a pFastBac1 vector modified to have an Igκ signal sequence (SEQ ID NO: 9) and a FLAG peptide for purification (SEQ ID NO: 10). The Igκ signal sequence is inserted for the purpose of converting the expressed protein into a secretion form, while the FLAG peptide is inserted for the purpose of purification. To insert the FLAG peptide, a DNA fragment obtained from OT3 (SEQ ID NO: 11) as a template using primers OT20 (SEQ ID NO: 12) and OT21 (SEQ ID NO: 13) was inserted with BamHI and EcoRI. Further, to insert a Gateway sequence, a Gateway Vector Conversion system (Invitrogen Corporation) was used to introduce a Conversion cassette.

Subsequently, the whole volume of the above mixed solution (11 μl) was mixed with 100 μl competent cells (E. coli DH5α). After heat shock treatment, the cells were seeded on an ampicillin-containing LB plate. On the next day, colonies were collected and confirmed by direct PCR for the DNA of interest, and the vector (pFBIF-G34A) was extracted and purified.

(3) Construction of Bacmid by Bac-to-Bac System

Next, a Bac-to-Bac system (Invitrogen Corporation) was used to cause recombination between the above pFBIF- and pFastBac, so that G34 and other sequences were inserted into a bacmid capable of growing in insect cells.

This system utilizes a Tn7 recombination site and allows a gene of interest to be incorporated into a bacmid through a recombinant protein produced from a helper plasmid when pFastBac carrying the inserted gene of interest is merely introduced into bacmid-containing E. coli (DH10BAC, Invitrogen Corporation). In addition, such a bacmid contains the lacZ gene and allows selection based on the classical blue (not inserted)/white (inserted) colony screening.

Namely, the vector purified above (pFBIF-G34A) was mixed with 50 μl competent cells (E. coli DH10BAC). After heat shock treatment, the cells were seeded on an LB plate containing kanamycin, gentamicin, tetracycline, Bluo-gal and IPTG. On the next day, white single colonies were further cultured to collect the bacmid.

Introduction of Human G34 Gene-Containing Bacmid into Insect Cells

After confirming that the sequence of interest was inserted into the bacmid obtained from the above white colonies, this bacmid was introduced into insect cells (Sf21, commercially available from Invitrogen Corporation).

Namely, Sf21 cells were added to a 35 mm dish at 9×10⁵ cells/2 ml antibiotic-containing Sf-900SFM (Invitrogen Corporation) and cultured at 27° C. for 1 hour to allow cell adhesion. Separately, Solution A was prepared by diluting purified bacmid DNA (5 μl) with 100 μl antibiotic-free Sf-900SFM, and Solution B was prepared by diluting CellFECTIN Reagent (6 μl, Invitrogen Corporation) with 100 μl antibiotic-free Sf-900SFM. Solutions A and B were then mixed carefully and incubated for 45 minutes at room temperature. After confirming cell adhesion, the culture solution was aspirated and replaced by antibiotic-free Sf-900SFM (2 ml). The solution prepared by mixing Solutions A and B (lipid-DNA complexes) was diluted and mixed carefully with antibiotic-free Sf900II (800 μl). The culture solution was aspirated from the cells and replaced by the diluted solution of lipid-DNA complexes, followed by incubation at 27° C. for 5 hours. The transfection mixture was then removed and replaced by antibiotic-containing Sf-900SFM culture solution (2 ml), followed by incubation at 27° C. for 72 hours. At 72 hours after transfection, the cells were released by pipetting and collected together with the culture solution, followed by centrifugation at 3000 rpm for 10 minutes. The resulting supernatant was stored in another tube (which was used as a first virus solution).

Sf21 cells were introduced into a T75 culture flask at 1×10⁷ cells/20 ml Sf-900SFM (antibiotic-containing) and incubated at 27° C. for 1 hour. After the cells had adhered, the first virus solution (800 μl) was added and the cells were cultured at 27° C. for 48 hours. After 48 hours, the cells were released by pipetting and collected together with the culture solution, followed by centrifugation at 3000 rpm for 10 minutes. The resulting supernatant was stored in another tube (which was used as a second virus solution).

Moreover, Sf21 cells were introduced into a T75 culture flask at 1×10⁷ cells/20 ml Sf-900SFM (antibiotic-containing) and incubated at 27° C. for 1 hour. After the cells had adhered, the second virus solution (100 μl) was added and the cells were cultured at 27° C. for 72 hours. After culturing, the cells were released by pipetting and collected together with the culture solution, followed by centrifugation at 3000 rpm for 10 minutes. The resulting supernatant was stored in another tube (which was used as a third virus solution). In addition, Sf21 cells were introduced into a 100 ml spinner flask at a concentration of 6×10⁵ cells/ml in a volume of 100 ml. The third virus solution (1 ml) was added and the cells were cultured at 27° C. for about 96 hours. After culturing, the cells and the culture solution were collected and centrifuged at 3000 rpm for 10 minutes. The resulting supernatant was stored in another tube (which was used as a fourth virus solution).

Resin Purification of G34

The pFLAG-G34 supernatant of the above fourth virus solution (10 ml) was mixed with NaN₃ (0.05%), NaCl (150 mM), CaCl₂ (2 mM) and anti-FLAG-M1 resin (100 μl, SIGMA), followed by overnight stirring at 4° C. On the next day, the mixture was centrifuged (3000 rpm, 5 minutes, 4° C.) to collect a pellet fraction. After addition of 2 mM CaCl₂-TBS (900 μl), centrifugation was repeated (2000 rpm, 5 minutes, 4° C.) and the resulting pellet was suspended in 200 μl of 1 mM CaCl₂-TBS for use as a sample for activity measurement (G34 enzyme solution). A part of this sample was electrophoresed by SDS-PAGE and Western blotted using anti-FLAG M2-peroxidase (SIGMA) to confirm the expression of the G34 protein of interest. As a result, a plurality of bands were detected broadly around a position of about 60 kDa (which would be due to differences in post-translational modifications such as glycosylation), thus confirming the expression of the G34 protein.

Example 2 Search for Glycosyltransferase Activity of Human G34 Protein

(1) Screening of GalNAc Transferase Activity

The G34 protein was examined for its substrate specificity, optimum buffer, optimum pH and divalent ion requirement in its β1,3-N-acetylgalactosaminyltransferase activity.

The following reaction system was used for examining the G34 enzyme protein for its acceptor substrate specificity in its GalNAc transfer activity.

In the reaction solutions shown below, each of the following was used at 10 nmol as an acceptor substrate: pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, pNp-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc, pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl and pNp-α-Man (all purchased from SIGMA), wherein “Gal” represents a D-galactose residue, “Xyl” represents a D-xylose residue, “Fuc” represents a D-fucose residue, “Man” represents a D-mannose residue and “GlcA” represents a glucuronic acid residue.

Each reaction solution was prepared as follows (final concentrations inparentheses): each substrate (10 nmol), MES (2-morpholinoethanesulfonicacid) (pH 6.5, 50 mM), MnCl₂ (10 mM), Triton X-100 (trade name) (0.1%),UDP-GalNAc (2 mM) and UDP-[¹⁴C]GlcNAc (40 nCi) were mixed andsupplemented with 5 μl G34 enzyme solution, followed by dilution withH₂O to a total volume of 20 μl (see Table 1).

TABLE 1 Composition of reaction solutions (μl)

                        E(+), D(+)   ×8     E(−), D(+)   E(+), D(−)
Enzyme solution         5            40     0            5
140 mM HEPES pH 7.4     2            16     2            2
100 mM UDP-GalNAc       0.5          4      0.5          0
200 mM MnCl₂            1            8      1            1
10% Triton CF-54        0.6          4.8    0.6          0.6
H₂O                     5.9          47.2   10.9         6.4
10 nmol/μl Acceptor     5            40     5            5
Total                   20                  20           20
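The final concentrations implied by Table 1 can be checked with the usual dilution rule C1·V1 = C2·V2. The following is a minimal sketch for the E(+), D(+) column only; the dictionary layout and the function name are illustrative choices, not part of the original protocol.

```python
# Stock concentrations and volumes for the E(+), D(+) column of Table 1.
# Each entry: (stock concentration, unit, volume in microlitres).
TOTAL_VOLUME_UL = 20.0

components = {
    "HEPES pH 7.4": (140.0, "mM", 2.0),
    "UDP-GalNAc": (100.0, "mM", 0.5),
    "MnCl2": (200.0, "mM", 1.0),
    "Triton CF-54": (10.0, "%", 0.6),
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock * volume_ul / total_ul


# Final concentration of each component in the 20 microlitre reaction.
final = {
    name: round(final_concentration(stock, vol), 3)
    for name, (stock, unit, vol) in components.items()
}

for name, (stock, unit, vol) in components.items():
    print(f"{name}: {final[name]} {unit}")
```

On these volumes HEPES, MnCl₂ and Triton CF-54 come out at 14 mM, 10 mM and 0.3%, while UDP-GalNAc works out to 2.5 mM, slightly above the nominal 2 mM quoted in the text.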

The above reaction mixtures were each reacted at 37° C. for 16 hours. After completion of the reaction, 200 μl H₂O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 ml H₂O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After washing the cartridge twice with 1 ml H₂O, the adsorbed substrate and product were eluted with 1 ml methanol. The eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation with a scintillation counter (Beckman Coulter).

As a result, the G34 protein was identified as a GalNAc transferase having the ability to transfer GalNAc to pNp-β-GlcNAc. The enzymatic activity increased linearly at least over the course of the reaction time between 0 and 16 hours when UDP-GlcNAc was used as a donor substrate and Bz-β-GlcNAc was used as an acceptor substrate (see Table 2 and FIG. 1).

TABLE 2

Reaction time   Area (%)
 1 hour         0
 2 hours        2.388
 4 hours        6.195
16 hours        13.719

Determination of Linking Mode

NMR was performed to analyze the linking mode of the sugar chainstructure synthesized by the G34 enzyme protein.

First, the reaction solution (final concentrations in parentheses) was prepared by adding Bz-β-GlcNAc (640 nmol) as an acceptor substrate, HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), MnCl₂ (10 mM) and 500 μl G34 enzyme solution, followed by dilution with H₂O to a total volume of 2 ml. This reaction solution was reacted at 37° C. for 16 hours. The reaction solution was heated for 5 minutes at 95° C. to stop the reaction and then purified by filtration through an Ultrafree-MC (Millipore Corporation).

In one development, 50 μl of the filtrate was analyzed by high performance liquid chromatography (HPLC) using a reversed-phase column ODS-80Ts QA (4.6×250 mm, Tosoh Corporation, Japan). The developing solvent used was an aqueous 9% acetonitrile-0.1% trifluoroacetic acid solution. The elution conditions were set to 1 ml/minute at 40° C. Absorbance at 210 nm was used as an index for elution peak detection using an SPD-10Avp (Shimadzu Corporation, Japan). As a result, a new elution peak was observed, which was not detected in the control. This peak was separated and lyophilized for use as an NMR sample.

NMR was performed using a DMX750 (Bruker Daltonics). As a result, the sample was determined as having a β1-3 linkage between GalNAc and GlcNAc-β1-o-Bz (see FIGS. 2A and 2B). The reasons for this determination are as follows (see FIGS. 2A and 2B, along with FIGS. 3 and 4): a) two residues (referred to as A and B) both have a spin coupling constant of 8.4 Hz for the signal at position 1, suggesting that both pyranoses are in β-form; b) the spin coupling constants given in FIG. 3 indicate that A shows a spin coupling constant characteristic of glucose, while B shows a spin coupling constant characteristic of galactose; c) it is A that is linked to the benzyl because NOE was observed between the methylene proton of the benzyl and the A1 proton; d) there are two signals resulting from the methyl of N-acetyl and hence both residues are identified as N-acetylated sugars; and e) NOESY indicates the presence of NOE in B1-A3.

On the other hand, examination was also performed on motif sequences involved in the above enzymatic activity.

FIG. 5 shows the putative amino acid sequence of the G34 protein (SEQ ID NO: 2) compared with the amino acid sequences of various human β1-3Gal transferases (β3Gal-T1 to -T6). In FIG. 5, the boxed regions indicate the motifs common to Gal transferases. Among them, three motifs indicated with M1 to M3 are common to β1,3-linking glycosyltransferases. In this figure, the amino acid residues indicated with * are conserved among the compared sequences.

FIG. 6 shows a comparison of three motifs involved in the ability to form β1,3 linkages (corresponding to the M1 to M3 motifs in FIG. 5) among various β1-3GlcNAc transferases (β3Gn-T2 to -T5) and human Gal transferases T1 to T3, T5 and T6. In this figure, the amino acid residues indicated with * are conserved among the compared sequences.

As shown in FIGS. 5 and 6, it was indicated that the amino acid sequence of the G34 protein was conserved enough to have all the motifs (M1 to M3) involved in β1,3 linkages, upon comparison with the amino acid sequences of various known β1,3-linking glycosyltransferases.

Thus, this motif examination also supported the conclusion that the G34 protein has the ability to transfer GalNAc to GlcNAc with β1,3 glycosidic linkage.

Optimum Buffer and Optimum pH

The following reaction system was used for examining the optimum buffer and pH for the GalNAc transferase activity of G34. The acceptor substrate used was pNp-β-GlcNAc.

Any one of the following buffers was used (final concentrations in parentheses): MES (2-morpholinoethanesulfonic acid) buffer (pH 5.5, 5.78, 6.0, 6.5 and 6.75, 50 mM), sodium cacodylate buffer (pH 5.0, 5.6, 6.0, 6.2, 6.6, 6.8, 7.0, 7.2, 7.4 and 7.5, 25 mM) and N-[2-hydroxyethyl]piperazine-N′-[2-ethanesulfonic acid] (HEPES) buffer (pH 6.75, 7.00, 7.30, 7.40 and 7.50, 14 mM). The substrate (10 nmol), MnCl₂ (10 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM) and UDP-[¹⁴C]GlcNAc (40 nCi) were mixed and supplemented with 5 μl G34 enzyme solution, followed by dilution with H₂O to a total volume of 20 μl.

The above reaction mixtures were each reacted at 37° C. for 16 hours. After completion of the reaction, 200 μl H₂O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 ml H₂O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After washing the cartridge twice with 1 ml H₂O, the adsorbed substrate and product were eluted with 1 ml methanol. The eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation with a scintillation counter (Beckman Coulter).

As indicated by the results (see Table 3 and FIG. 7), in MES buffer, G34 showed the same strong activity around pH 5.50 and pH 5.78 within the examined range and its activity decreased in a pH-dependent manner until pH 6.5, but became strong again at pH 6.75. In sodium cacodylate buffer, the activity was highest at pH 5.0 within the examined range and the activity decreased in a pH-dependent manner until pH 6.2, increased in a pH-dependent manner until pH 7.0, and then plateaued until pH 7.4. In HEPES buffer, the activity increased in a pH-dependent manner and reached the highest value at pH 7.4 to 7.5 within the examined range. Among them, HEPES buffer at pH 7.4 to 7.5 resulted in the strongest activity.

TABLE 3

Buffer              pH     +      −      (+) − (−)
Sodium cacodylate   5.0    6042   204    5838
                    5.6    3353   159    3194
                    6.0    2689   260    2429
                    6.2    907    138    769
                    6.6    1093   136    957
                    6.8    2488   258    2230
                    7.0    4965   259    4706
                    7.2    4377   309    4068
                    7.4    4930   304    4626
MES                 5.50   3735   197    3538
                    5.78   3755   184    3571
                    6.00   2514   141    2373
                    6.50   1981   734    1247
                    6.75   3289   136    3153
HEPES               6.75   4894   149    4745
                    7.00   4912   121    4791
                    7.30   4294   127    4167
                    7.40   6630   120    6510
                    7.50   6895   240    6655
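The comparison drawn from Table 3 can be reproduced by subtracting the "−" count from the "+" count for each condition and taking the maximum per buffer. The sketch below uses the tabulated values as given; the variable and function names are illustrative only.

```python
# Raw counts from Table 3: buffer -> list of (pH, "+" count, "-" count).
table3 = {
    "Sodium cacodylate": [
        (5.0, 6042, 204), (5.6, 3353, 159), (6.0, 2689, 260),
        (6.2, 907, 138), (6.6, 1093, 136), (6.8, 2488, 258),
        (7.0, 4965, 259), (7.2, 4377, 309), (7.4, 4930, 304),
    ],
    "MES": [
        (5.50, 3735, 197), (5.78, 3755, 184), (6.00, 2514, 141),
        (6.50, 1981, 734), (6.75, 3289, 136),
    ],
    "HEPES": [
        (6.75, 4894, 149), (7.00, 4912, 121), (7.30, 4294, 127),
        (7.40, 6630, 120), (7.50, 6895, 240),
    ],
}


def net_counts(rows):
    """Background-subtracted count for each pH."""
    return [(ph, plus - minus) for ph, plus, minus in rows]


def optimum(rows):
    """(pH, net count) pair with the highest net count."""
    return max(net_counts(rows), key=lambda item: item[1])


# Best condition within each buffer, then the best buffer overall.
best = {buffer: optimum(rows) for buffer, rows in table3.items()}
overall = max(best.items(), key=lambda kv: kv[1][1])

for buffer, (ph, net) in best.items():
    print(f"{buffer}: highest net count {net} at pH {ph}")
print("Overall:", overall)
```

The per-buffer maxima fall at pH 5.0 (sodium cacodylate), pH 5.78 (MES) and pH 7.50 (HEPES), with HEPES giving the highest net count overall, in line with the description above.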

The following reaction system was used for examining the divalent ion requirement. The acceptor substrate used was Bz-β-GlcNAc.

The reaction solution (final concentrations in parentheses) was prepared by adding the substrate (10 nmol), HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), UDP-[¹⁴C]GlcNAc (40 nCi) and 5 μl G34 enzyme solution and further adding MnCl₂, MgCl₂ or CoCl₂ at 2.5 mM, 5 mM, 10 mM, 20 mM or 40 mM, followed by dilution with H₂O to a total volume of 20 μl.

The above reaction mixture was reacted at 37° C. for 16 hours. After completion of the reaction, 200 μl H₂O was added and the mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 ml H₂O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After washing the cartridge twice with 1 ml H₂O, the adsorbed substrate and product were eluted with 1 ml methanol. The eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation with a scintillation counter (Beckman Coulter).

The results (see Table 4 and FIG. 8) indicated that the activity was enhanced by the addition of each divalent ion and confirmed that the G34 protein was an enzyme requiring divalent ions. Its activity nearly plateaued at 5 mM or higher concentration of Mn or Co and at 10 mM or higher concentration of Mg. Moreover, the Mn-induced enhancement of the activity was completely eliminated by addition of Cu.

TABLE 4 RI assay (divalent ion requirement)

Metal ion   Concentration (mM)   DPM
Mn          2.5                  7260.09
            5                    8270.23
            10                   7748.77
            20                   7515.86
            40                   4870.48
            40                   371.53
Co          2.5                  10979.99
            5                    9503.91
            10                   10979.99
            20                   8070.47
            40                   7854.92
Mg          2.5                  4800.03
            5                    8692.15
            10                   8980.56
            20                   6726.32
            40                   5592.88
none        —                    2427.39
EDTA        20                   149.32
Mn + Cu     10 + 10              239
none        —                    155.64

Substrate Specificity to Oligosaccharides

The following reaction system was used for examining the acceptor substrate specificity to oligosaccharides. The acceptor substrates used were pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, Bz-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc, pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl, pNp-α-Man, lactoside-Bz, Lac-ceramide, Gal-ceramide, paragloboside, globoside, Gal-β1-4 GalNAc-α-pNp, Gal-β1-3-GlcNAc-β-pNp, GlcNAc-β1-4 GlcNAc-β-Bz, pNp-core1 (Gal-β1-3 GalNAc-α-pNp), pNp-core2 (Gal-β1-3 (GlcNAc-β1-6) GalNAc-α-pNp), pNp-core3 (GlcNAc-β1-3 GalNAc-α-pNp) and pNp-core6 (GlcNAc-β1-6 GalNAc-α-pNp). “Lac” represents a D-lactose residue.

Each reaction solution (final concentrations in parentheses) was prepared by adding each substrate (50 nmol), HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), MnCl₂ (10 mM), UDP-[³H]GlcNAc and 5 μl G34 enzyme solution, followed by dilution with H₂O to a total volume of 20 μl.

The above reaction mixtures were each reacted at 37° C. for 2 hours. After completion of the reaction, 200 μl H₂O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 ml H₂O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After washing the cartridge twice with 1 ml H₂O, the adsorbed substrate and product were eluted with 1 ml methanol. The eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation with a scintillation counter (Beckman Coulter).

The results thus measured were compared assuming that the radioactivity obtained using Bz-β-GlcNAc as a substrate was set to 100% (see Table 5). When used as a substrate, pNp-core2 showed the largest increase in radioactivity. Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core6 and pNp-core3 also showed increases in radioactivity in the order named. The other substrates showed no increase in radioactivity.

TABLE 5

No.   Acceptor substrate        %
1     pNp-α-Gal                 N.D.
2     oNp-β-Gal                 N.D.
3     Bz-α-GlcNAc               N.D.
4     Bz-β-GlcNAc               100
5     Bz-α-GalNAc               N.D.
6     pNp-β-GalNAc              N.D.
7     pNp-α-Glc                 N.D.
8     pNp-β-Glc                 N.D.
9     pNp-β-GlcA                N.D.
10    pNp-α-Fuc                 N.D.
11    pNp-α-Xyl                 N.D.
12    pNp-β-Xyl                 N.D.
13    pNp-α-Man                 N.D.
14    Lactoside-Bz              N.D.
15    Lac-ceramide              N.D.
16    Gal-ceramide              N.D.
17    Paragloboside             N.D.
18    Globoside                 N.D.
19    Galβ1-4GalNAc-α-pNp       N.D.
20    Galβ1-3GlcNAc-β-pNp       N.D.
21    GlcNAcβ1-4GlcNAc-β-Bz     29
22    core1-pNp                 N.D.
23    core2-pNp                 185
24    core3-pNp                 8
25    core6-pNp                 19

N.D.: Not determined due to no radioactivity
core1: Gal-β1-3-GalNAc-α-pNp
core2: Gal-β1-3-(GlcNAc-β1-6)GalNAc-α-pNp
core3: GlcNAc-β1-3-GalNAc-α-pNp
core6: GlcNAc-β1-6-GalNAc-α-pNp
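The percentages in Table 5 are radioactivity values normalized to the Bz-β-GlcNAc reaction. As a minimal sketch, the helper below shows that normalization and ranks the acceptors that gave a measurable signal; `relative_percent` and its inputs are hypothetical names for illustration, since the raw counts are not given in the text.

```python
def relative_percent(signal, reference):
    """Express a measured signal as a percentage of the reference reaction."""
    return 100.0 * signal / reference


# Relative activities reported in Table 5 (substrates marked N.D. omitted).
relative_activity = {
    "Bz-β-GlcNAc": 100,
    "GlcNAcβ1-4GlcNAc-β-Bz": 29,
    "core2-pNp": 185,
    "core3-pNp": 8,
    "core6-pNp": 19,
}

# Acceptors ordered from highest to lowest relative activity.
ranked = sorted(relative_activity, key=relative_activity.get, reverse=True)

for name in ranked:
    print(f"{name}: {relative_activity[name]}%")
```

The resulting order (core2, Bz-β-GlcNAc, GlcNAcβ1-4GlcNAc-β-Bz, core6, core3) matches the order stated in the text.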

(2) Confirmation of Activity by HPLC Analysis

Using uridine diphosphate-N-acetylgalactosamine (UDP-GalNAc; Sigma-Aldrich Corporation) as a sugar residue donor substrate and Bz-β-GlcNAc as a sugar residue acceptor substrate, the enzymatic activity of G34 was analyzed by high performance liquid chromatography (HPLC).

The reaction solution (final concentrations in parentheses) was prepared by adding Bz-β-GlcNAc (10 nmol), HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), MnCl₂ (10 mM) and 10 μl G34 enzyme solution, followed by dilution with H₂O to a total volume of 20 μl. This reaction solution was reacted at 37° C. for 16 hours. The reaction was stopped by addition of H₂O (100 μl) and the reaction solution was purified by filtration through an Ultrafree-MC (Millipore Corporation).

The filtrate (10 μl) was analyzed by high performance liquid chromatography (HPLC) using a reversed-phase column ODS-80Ts QA (4.6×250 mm, Tosoh Corporation, Japan). The developing solvent used was an aqueous 9% acetonitrile-0.1% trifluoroacetic acid solution. The elution conditions were set to 1 ml/minute at 40° C. Absorbance at 210 nm was used as an index for elution peak detection using an SPD-10Avp (Shimadzu Corporation, Japan).

As a result, a new elution peak was observed, which was not detected in the control.

(3) Analysis of Reaction Product by Mass Spectrometry

The above peak was collected and the reaction product was analyzed by mass spectrometry. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) was performed using a Reflex IV (Bruker Daltonics). The sample at 10 μmol was dried and dissolved in 1 μl distilled water for use as a MALDI-TOF-MS sample.

As a result, a peak at 538.194 m/z was observed.

This peak corresponded to the molecular weight of GalNAc-GlcNAc-Bz(sodium salt).

This result also indicated that the G34 enzyme protein transfers GalNAc to Bz-β-GlcNAc.

Example 3 Measurement for mRNA Expression Level of Human G34

(1) Expression Levels in Various Human Normal Tissues

Quantitative real-time PCR was used for comparing the mRNA expression levels of G34 in human normal tissues. Quantitative real-time PCR is a PCR method using a sense primer and an antisense primer in combination with a fluorescently-labeled probe. When a gene is amplified by PCR, a fluorescent label of the probe will be released to produce fluorescence. The fluorescence intensity increases in correlation with gene amplification and is thus used as an index for quantification.

RNA of each human normal tissue (Clontech) was extracted with an RNeasy Mini Kit (QIAGEN) and converted into single strand DNA by the oligo(dT) method using a Super-Script First-Strand Synthesis System (Invitrogen Corporation). This DNA was used as a template and subjected to quantitative real-time PCR in an ABI PRISM 7700 (Applied Biosystems Japan Ltd.) using a 5′-primer (SEQ ID NO: 14), a 3′-primer (SEQ ID NO: 15) and a TaqMan probe (SEQ ID NO: 16). PCR was performed under conditions of 50° C. for 2 minutes and 95° C. for 10 minutes, and then under conditions of 50 cycles of 95° C. for 15 seconds and 60° C. for 1 minute. To prepare a calibration curve, plasmid DNA obtained by introducing a partial sequence of G34 into pFLAG-CMV3 (Invitrogen Corporation) was used as a template and subjected to PCR as described above.
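A calibration curve of this kind is commonly evaluated by fitting the threshold cycle (Ct) against log10 of the template copy number and reading unknown samples off the fitted line. The sketch below assumes that approach; the dilution series and the sample Ct are hypothetical numbers chosen for illustration and are not data from this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical plasmid standards: (copies per reaction, measured Ct).
standards = [(1e3, 30.0), (1e4, 26.7), (1e5, 23.4), (1e6, 20.1)]

log_copies = [math.log10(c) for c, _ in standards]
ct_values = [ct for _, ct in standards]
slope, intercept = fit_line(log_copies, ct_values)

# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct):
    """Interpolate a copy number from a Ct value using the fitted line."""
    return 10 ** ((ct - intercept) / slope)


sample_ct = 25.05  # hypothetical Ct of an unknown sample
copies = copies_from_ct(sample_ct)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={copies:.0f}")
```

Copy numbers obtained this way would then be scaled to the amount of total RNA used, giving values in the units reported in the tables (copies per μg total RNA).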

The results confirmed that high-level expression was observed specifically in the testis, followed by skeletal muscle and prostate in the order named (Table 6).

TABLE 6 G34 mRNA expression levels in human normal tissues

Tissue                    Copy number (×10000/μg, total RNA)   Standard error
Brain                     5.0                                  1.1
Fetal brain               10.3                                 0.7
Cerebellum                2.8                                  0.3
Medulla oblongata         4.9                                  0.3
Submandibular gland       6.7                                  0.4
Thyroid gland             1.8                                  0.6
Trachea                   3.9                                  0.3
Lung                      0.4                                  0.1
Heart                     0.1                                  0.1
Skeletal muscle           25.8                                 1.1
Small intestine           5.1                                  0.3
Large intestine (colon)   0.6                                  0.3
Liver                     0.3                                  0.1
Fetal liver               0.7                                  0.3
Pancreas                  4.2                                  1.1
Kidney                    1.6                                  0.3
Adrenal gland             10.8                                 1.3
Thymus                    4.8                                  0.2
Bone marrow               3.1                                  0.4
Spleen                    4.2                                  0.3
Testis                    115.5                                2.0
Prostate                  14.6                                 1.5
Mammary gland             5.2                                  0.2
Uterus                    5.0                                  0.2
Placenta                  1.4                                  0.4

(2) Expression Levels in Human Cancer Cell Lines

Quantitative real-time PCR as mentioned above was used for comparing the mRNA expression levels of G34 in various cancer-derived human cell lines. After cells of each human cell line were collected, RNA was extracted with an RNeasy Mini Kit (QIAGEN) and converted into single strand DNA by the oligo(dT) method using a Super-Script First-Strand Synthesis System (Invitrogen Corporation). This DNA was used as a template and subjected to quantitative real-time PCR in an ABI PRISM 7700 (Applied Biosystems Japan Ltd.) using a 5′-primer (SEQ ID NO: 14), a 3′-primer (SEQ ID NO: 15) and a TaqMan probe (SEQ ID NO: 16). PCR was performed under conditions of 50° C. for 2 minutes and 95° C. for 10 minutes, and then under conditions of 50 cycles of 95° C. for 15 seconds and 60° C. for 1 minute.
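The thermal profile described above can be written down as a small data structure, which also makes the total programmed hold time easy to compute. This is only an illustrative sketch; the variable names are assumptions, and instrument ramp times between steps are not included.

```python
# Thermal profile from the text: two initial holds, then 50 two-step cycles.
# Each step is (temperature in degrees C, hold time in seconds).
initial_holds = [(50, 120), (95, 600)]
cycle_steps = [(95, 15), (60, 60)]
n_cycles = 50


def total_hold_seconds(holds, steps, cycles):
    """Sum of all programmed hold times, ignoring ramping between steps."""
    initial = sum(seconds for _, seconds in holds)
    per_cycle = sum(seconds for _, seconds in steps)
    return initial + cycles * per_cycle


total_s = total_hold_seconds(initial_holds, cycle_steps, n_cycles)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.1f} min)")
```

With these settings the programmed holds add up to 4470 seconds, that is 74.5 minutes per run before ramping is taken into account.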

As a result, the expression was observed in all the human cell lines(Table 7, FIG. 9).

TABLE 7 G34 mRNA expression levels in human cell lines

Cancer type                       Cell line     Copy number (×10⁴/μg, total RNA)
Neuroblastoma                     SCCH-26       7.9     0.6
                                  NAGAI         19.5    1.5
                                  NB-9          40.6    2.3
                                  SK-N-SH       14.9    0.7
                                  SK-N-MC       5.8     0.5
                                  NB-1          20.9    0.5
                                  IMR32         21.0    0.2
Glioma                            T98G          6.2     0.2
                                  YKG-1         3.9     0.0
                                  A172          13.4    0.9
                                  GI-1          13.7    1.3
                                  U118MG        6.8     0.5
                                  U251          28.9    1.9
                                  KG-1-C        9.1     0.6
Lung cancer                       Lu130         6.8     0.4
                                  Lu134A        30.3    1.2
                                  Lu134B        6.8     0.4
                                  Lu135         7.2     1.3
                                  Lu139         10.7    0.5
                                  Lu140         15.4    1.8
                                  SBC-1         2.5     0.2
                                  PC-7          9.1     0.2
                                  PC-9          22.4    0.1
                                  HAL-8         15.2    1.2
                                  HAL-24        20.8    1.7
                                  ABC-1         10.3    0.9
                                  RERF-LC-MC    22.8    2.2
                                  EHHA-9        20.3    7.9
                                  PC-1          2.1     0.2
                                  EBC-1         4.4     0.2
                                  PC-10         118.8   4.9
                                  A549          27.1    2.6
                                  LX-1          30.7    2.1
Esophageal cancer                 ES1           23.0    2.5
                                  ES2           16.1    0.6
                                  ES6           42.8    3.0
Gastric cancer                    MKN1          6.2     1.1
                                  MKN28         8.6     1.0
                                  MKN7          9.7     0.1
                                  MKN74         3.5     0.8
                                  MKN-45        7.3     2.1
                                  HSC-43        42.8    1.7
                                  KATOIII       6.4     0.4
                                  TMK-1         10.8    1.2
Large intestine (colon) cancer    LSC           11.8    0.6
                                  LSB           4.9     0.3
                                  SW480         10.1    0.4
                                  SW1116        24.1    1.4
                                  Colo201       10.4    0.4
                                  Colo205       6.8     0.9
                                  C1            21.9    1.2
                                  WiDr          1.2     0.0
                                  HCT8          82.2    6.2
                                  HCT15         12.1    1.0
Others                            A204          67.9    4.4
                                  A-431         30.6    2.5
                                  SW1736        11.9    1.1
                                  HepG2         2.3     0.3
                                  Capan-2       19.4    1.2
                                  293T          55.1    8.3
                                  PA-1          3.5     0.6
Leukemia                          HL-60         2.1     0.1
                                  K-562         17.1    1.8
Lymphoma                          Daudi         2.4     0.2
                                  Namalwa       13.0    1.2
                                  KHM-IB        16.4    0.4
                                  Ramos         9.5     0.7
                                  Raji          11.6    1.3
                                  Jurkat        42.7    1.9

(3) Expression Levels in Cancerous Tissues

Quantitative real-time PCR as mentioned above was used for comparing the mRNA expression levels of G34 in cancer tissues and their surrounding normal tissues derived from patients with large intestine (colon) cancer and lung cancer.

From cancer and normal tissues of the same patient, RNA was extracted with an RNeasy Mini Kit (QIAGEN) and converted into single strand DNA by the oligo(dT) method using a Super-Script First-Strand Synthesis System (Invitrogen Corporation). This DNA was used as a template and subjected to quantitative real-time PCR in an ABI PRISM 7700 (Applied Biosystems Japan Ltd.) using a 5′-primer (SEQ ID NO: 14), a 3′-primer (SEQ ID NO: 15) and a TaqMan probe (SEQ ID NO: 16). PCR was performed under conditions of 50° C. for 2 minutes and 95° C. for 10 minutes, and then under conditions of 50 cycles of 95° C. for 15 seconds and 60° C. for 1 minute. To correct variations among individuals, the resulting data were divided by the value of β-actin (internal standard gene) quantified using a kit of Applied Biosystems Japan before being compared.

The results indicated that the mRNA expression level of the G34 gene was significantly increased in these cancerous tissues (Table 8, Table 9).

TABLE 8 G34 mRNA expression levels in tissues from large intestine cancer patients

Patient No.   Normal tissue   Standard error   Cancer tissue   Standard error   % Change
1             0.15            0.04             0.35            0.07             2.3
2             0.15            0.07             8.63            0.65             58.0
3             0.07            0.02             1.55            0.15             23.5
4             0.08            0.05             1.82            0.26             22.0
5             0.08            0.02             0.60            0.07             7.2
6             1.04            0.08             1.92            0.21             1.8
7             0.07            0.02             5.37            1.06             81.3
8             1.54            0.27             8.30            0.96             5.4
9             0.05            0.04             1.70            0.37             34.3
10            0.05            0.04             0.10            0.04             2.0
11            0.60            0.29             10.23           1.47             17.2
12            0.17            0.13             2.36            0.43             14.3
13            0.18            0.09             1.70            0.27             9.4
14            0.18            0.08             2.76            0.23             15.2
15            0.18            0.05             3.49            0.34             19.2
16            0.20            0.15             1.84            0.25             9.3
17            0.28            0.05             7.41            0.51             26.4
18            0.05            0.04             5.92            0.38             119.3
19            0.15            0.11             4.68            0.67             31.4
20            0.13            0.06             4.61            2.22             34.9
21            0.02            0.02             8.40            1.65             508.0
22            0.20            0.07             3.57            0.43             18.0
23            0.55            0.27             2.33            1.23             4.3
Average       0.25            0.07             3.97            0.55             15.6

Copy number (×10000/μg, total RNA)

TABLE 9 G34 mRNA expression levels in tissues from lung cancer patients

Patient No.   Normal tissue   Standard error   Cancer tissue   Standard error   % Change
1             0.48            0.06             2.03            0.27             4.2
3             0.00            0.00             0.55            0.21             —
4             2.43            0.40             6.13            0.17             2.5
5             0.10            0.04             2.74            0.32             27.7
6             1.69            0.28             3.11            0.69             1.8
7             0.60            0.16             2.76            0.35             4.6
8             2.30            0.38             6.23            0.21             2.7
9             1.26            0.27             2.51            0.10             2.0
10            1.47            0.18             4.76            0.57             3.2
11            0.64            0.00             1.14            0.11             1.8
12            0.56            0.06             0.69            0.04             1.2
13            1.32            0.02             1.98            0.15             1.5
14            0.17            0.02             0.66            0.02             4.0
15            0.71            0.05             2.71            0.13             3.8
16            1.07            0.13             15.64           1.11             14.6
17            1.03            0.12             8.27            0.73             8.1
18            0.13            0.02             1.95            0.09             14.8
Average       0.94            0.71             3.76            3.64             4.0

Copy number (×10000/μg, total RNA)
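The values in the "% Change" column of Tables 8 and 9 appear consistent with the ratio of the cancer-tissue value to the normal-tissue value. The sketch below recomputes that ratio for a few rows under this assumption; the function name is illustrative, and other rows may differ slightly from the tabulated figure because the tabulated inputs are rounded.

```python
def fold_change(normal, cancer):
    """Cancer/normal ratio, or None when the normal value is zero."""
    if normal == 0:
        return None
    return cancer / normal


# (patient number, normal tissue, cancer tissue), copied from the tables.
colon_rows = [(1, 0.15, 0.35), (6, 1.04, 1.92), (10, 0.05, 0.10)]  # Table 8
lung_rows = [(1, 0.48, 2.03), (3, 0.00, 0.55), (9, 1.26, 2.51)]    # Table 9

colon_ratios = {p: fold_change(n, c) for p, n, c in colon_rows}
lung_ratios = {p: fold_change(n, c) for p, n, c in lung_rows}

for label, ratios in (("colon", colon_ratios), ("lung", lung_ratios)):
    for patient, ratio in ratios.items():
        shown = "not defined" if ratio is None else f"{ratio:.1f}"
        print(f"{label} patient {patient}: {shown}")
```

Lung patient 3 has a normal-tissue value of 0.00, so no ratio can be formed, which corresponds to the dash shown for that row in Table 9.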

Example 4 Cloning and Expression of Mouse G34 Gene

The human G34 sequence obtained in Example 1 was used as a query for a search against the mouse gene sequence database Celera (Applied Biosystems) to thereby find a corresponding nucleic acid sequence with high homology. The open reading frame (ORF) estimated from this nucleic acid sequence is composed of 1515 bp (SEQ ID NO: 3), i.e., 504 amino acids (SEQ ID NO: 4) when calculated as an amino acid sequence, and has a hydrophobic amino acid region characteristic of glycosyltransferases at its N-terminal end. This sequence shares a homology of 86% (nucleic acid sequence) and 88% (amino acid sequence) with human G34 (SEQ ID NOs: 1 and 2) (see FIG. 10). Moreover, the sequence retains all of the three motifs conserved in the β3GalT family. The product encoded by the nucleic acid sequence of SEQ ID NO: 3 and having the amino acid sequence of SEQ ID NO: 4 was designated mouse G34 (mG34).

To examine the activity of mG34, mG34 was expressed in a mammalian cell line. In this example, the active region covering amino acid 35 to the C-terminal end of mG34 was genetically introduced into a mammalian cell line expression vector pFLAG-CMV3 using a FLAG Protein Expression system (Sigma-Aldrich Corporation).
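The relationship between the ORF length, the number of encoded residues and the nucleotide span of the expressed region can be checked with simple codon arithmetic. The sketch below assumes that the 1515 bp ORF includes one stop codon; the function names are illustrative.

```python
def orf_amino_acids(orf_bp):
    """Residues encoded by an ORF whose length includes one stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


def residue_span_to_nucleotides(first_residue, last_residue):
    """1-based nucleotide positions covering residues first..last."""
    start = (first_residue - 1) * 3 + 1
    end = last_residue * 3
    return start, end


mouse_residues = orf_amino_acids(1515)
active_region = residue_span_to_nucleotides(35, mouse_residues)

print(mouse_residues)   # residues encoded by the mouse ORF
print(active_region)    # nucleotides covering residue 35 to the C-terminus
```

Under this assumption the ORF encodes 504 residues, and residue 35 to the C-terminal end corresponds to nucleotides 103 to 1512, the same range that the claims cite for SEQ ID NO: 3.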

The expression in mouse tissues was confirmed by PCR. Each mouse tissue (brain, thymus, stomach, small intestine, large intestine (colon), liver, pancreas, spleen, kidney, testis or skeletal muscle) was used as a template and subjected to PCR using a 5′-primer (mG34-CMV-F1; SEQ ID NO: 17) and a 3′-primer (mG34-CMV-R1; SEQ ID NO: 18). PCR was performed under conditions of 25 cycles of 98° C. for 10 seconds, 55° C. for 30 seconds, and 72° C. for 2 minutes. The PCR product was electrophoresed on an agarose gel to confirm a band of approximately 1500 bp. As a result, as shown in Table 10, the expression level was highest in the testis, followed by spleen and skeletal muscle in the order named.

TABLE 10 mG34 mRNA expression levels in mouse tissues

Tissue                    Expression level
Brain                     ±
Thymus                    −
Stomach                   +
Small intestine           −
Large intestine (colon)   +
Liver                     +
Pancreas                  −
Spleen                    −
Kidney                    ++
Testis                    +++
Skeletal muscle           ++

Mouse testis-derived cDNA was used as a template and subjected to PCR using a 5′-primer (mG34-CMV-F1; SEQ ID NO: 17) and a 3′-primer (mG34-CMV-R1; SEQ ID NO: 18) to obtain a DNA fragment of interest. PCR was performed under conditions of 25 cycles of 98° C. for 10 seconds, 55° C. for 30 seconds, and 72° C. for 2 minutes. The PCR product was then electrophoresed on an agarose gel and isolated in a standard manner after gel excision. This PCR product has restriction enzyme sites HindIII and NotI at the 5′ and 3′ sides, respectively.

After this DNA fragment and pFLAG-CMV3 were each treated with restriction enzymes HindIII and NotI, the reaction solutions were mixed together and subjected to ligation reaction, so that the DNA fragment was introduced into pFLAG-CMV3. The reaction solution was purified by ethanol precipitation and then mixed with competent cells (E. coli DH5α). After heat shock treatment (42° C., 30 seconds), the cells were seeded on ampicillin-containing LB agar medium.

On the next day, the resulting colonies were confirmed by direct PCR for the DNA of interest. For more reliable results, after sequencing to confirm the DNA sequence, the vector (pFLAG-CMV3-mG34A) was extracted and purified.

Human kidney cell-derived cell line 293T cells (2×10⁶) were suspended in 10 ml antibiotic-free DMEM medium (Invitrogen Corporation) supplemented with 10% fetal bovine serum, seeded in a 10 cm dish and cultured for 16 hours at 37° C. in a CO₂ incubator. pFLAG-CMV3-mG34A (20 ng) and Lipofectamine 2000 (30 μl, Invitrogen Corporation) were each mixed with 1.5 ml OPTI-MEM (Invitrogen Corporation) and incubated at room temperature for 5 minutes. These two solutions were further mixed gently and incubated at room temperature for 20 minutes. This mixed solution was added dropwise to the dish and the cells were cultured for 48 hours at 37° C. in a CO₂ incubator.

The supernatant (10 ml) was mixed with NaN₃ (0.05%), NaCl (150 mM), CaCl₂ (2 mM) and anti-M1 resin (100 μl, SIGMA), followed by overnight stirring at 4° C. On the next day, the mixture was centrifuged (3000 rpm, 5 minutes, 4° C.) to collect a pellet fraction. After addition of 2 mM CaCl₂-TBS (900 μl), centrifugation was repeated (2000 rpm, 5 minutes, 4° C.) and the resulting pellet was suspended in 200 μl of 1 mM CaCl₂-TBS for use as a sample for activity measurement (mouse G34 enzyme solution). A part of this sample was electrophoresed by SDS-PAGE and Western blotted using anti-FLAG M2-peroxidase (SIGMA) to confirm the expression of the mG34 protein of interest. As a result, a band was detected at a position of about 60 kDa, thus confirming the expression of the mG34 protein.

Example 5 Search for Glycosyltransferase Activity of Mouse G34

The following reaction system was used for examining mouse G34 for its substrate specificity in its β1,3-N-acetylgalactosamine transferase activity. In the reaction solutions shown below, each of the following was used at 10 nmol as an “acceptor substrate”: pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, Bz-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc, pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl, pNp-α-Man, lactoside-Bz, Lac-ceramide, Gal-ceramide, Gb3, globoside, Gal-β1-4GalNAc-α-pNp, Galβ1-3GlcNAc-β-Bz, GlcNAc-β1-4-GlcNAc-β-Bz, core1-pNp, core2-pNp, core3-pNp and core6-pNp (all purchased from SIGMA).

Each reaction solution was prepared as follows (final concentrations in parentheses): each substrate (10 nmol), HEPES (N-[2-hydroxyethyl]piperazine-N′-[2-ethanesulfonic acid]) (pH 7.4, 14 mM), MnCl₂ (10 mM), Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM) and UDP-[¹⁴C]GlcNAc (40 nCi) were mixed and supplemented with 5 μl mouse G34 enzyme solution, followed by dilution with H₂O to a total volume of 20 μl.

The above reaction mixtures were each reacted at 37° C. for 16 hours. After completion of the reaction, 200 μl H₂O was added and each mixture was lightly centrifuged to obtain the supernatant. The supernatant was passed through a Sep-Pak plus C18 Cartridge (Waters), which had been washed once with 1 ml methanol and twice with 1 ml H₂O and then equilibrated, to allow the substrate and product in the supernatant to adsorb to the cartridge. After washing the cartridge twice with 1 ml H₂O, the adsorbed substrate and product were eluted with 1 ml methanol. The eluate was mixed with 5 ml liquid scintillator ACSII (Amersham Biosciences) and measured for the amount of radiation with a scintillation counter (Beckman Coulter).

The results thus measured were compared assuming that the radioactivity obtained using Bz-β-GlcNAc as a substrate was set to 100% (Table 11). When used as a substrate, Bz-β-GlcNAc showed the largest increase in radioactivity. core2-pNp, core6-pNp, core3-pNp, pNp-β-Glc and GlcNAc-β1-4-GlcNAc-β-Bz also showed high radioactivity in the order named. The other substrates showed no increase in radioactivity.

TABLE 11

Acceptor substrate        %
pNp-α-Gal                 ND
oNp-β-Gal                 ND
Bz-α-GlcNAc               ND
Bz-β-GlcNAc               100
Bz-α-GalNAc               ND
pNp-β-GalNAc              ND
pNp-α-Glc                 ND
pNp-β-Glc                 12
pNp-β-GlcA                ND
pNp-α-Fuc                 ND
pNp-α-Xyl                 ND
pNp-β-Xyl                 ND
pNp-α-Man                 ND
Lactoside-Bz              ND
Lac-ceramide              ND
Gal-ceramide              ND
Gb3                       ND
Globoside                 ND
Galβ1-4GalNAc-α-pNp       ND
Galβ1-3GlcNAc-β-pNp       ND
GlcNAcβ1-4GlcNAc-β-Bz     10
core1-pNp                 ND
core2-pNp                 25
core3-pNp                 14
core6-pNp                 18

Example 6 In Situ Hybridization on Mouse Testis

In situ hybridization using mG34 was performed on a mouse testis-derived sample to confirm the expression of mG34 in the mouse testis sample (see FIG. 11).

Example 7 Creation of G34 Knockout Mouse

A targeting vector (pBSK-mG34-KOneo) is constructed in which pBluescript II SK(−) (TOYOBO) is inserted with a chromosomal fragment of approximately 10 kb covering exons (i.e., Exons 3 to 12 (1242 bp) within the ORF region of mG34) containing activation domains of the gene (mG34) to be knocked out. pBSK-mG34-KOneo is also designed to have the drug resistance gene neo (neomycin resistance gene) introduced into Exons 7 to 9, which are putative GalNAc transferase active regions of mG34. As a result, Exons 7 to 9 of mG34 are deleted and replaced by neo. The pBSK-mG34-KOneo thus obtained is linearized with a restriction enzyme NotI, 80 μg of which is then transfected (e.g., by electroporation) into ES cells (derived from E14/129Sv mice) to select G418-resistant colonies. The G418-resistant colonies are transferred to 24-well plates and then cultured. After a part of the cells are frozen and stored, DNA is extracted from the remaining ES cells and around 120 colonies of recombinant clones are selected by PCR. Further, Southern blotting or other techniques are performed to confirm whether recombination has occurred as expected, finally selecting around 10 clones of recombinants. ES cells from two of the selected clones are injected into C57BL/6 mouse blastocysts. The mouse embryos injected with the ES cells are transplanted into the uteri of recipient mice to generate chimeric mice, followed by germline transmission to obtain heterozygous knockout mice.

1. A method for producing a β1,3-N-acetyl-D-galactosamine transferase protein, which comprises growing a transformant containing a vector carrying a nucleic acid to express the protein and collecting the protein from the transformant, wherein the nucleic acid is selected from the group consisting of the following i) to vi):
i) a nucleic acid consisting of a nucleotide sequence encoding a polypeptide having an amino acid sequence shown in SEQ ID NO: 2 or 4; or a polypeptide having an amino acid sequence with substitution, deletion or insertion of one or more amino acids in the amino acid sequence shown in SEQ ID NO: 2 or 4 and which transfers N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β1,3 linkage, or a nucleotide sequence complementary thereto;
ii) a nucleic acid consisting of the nucleotide sequence shown in SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to at least one of them;
iii) a nucleic acid consisting of a nucleotide sequence covering nucleotides 565 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto;
iv) a nucleic acid consisting of a nucleotide sequence covering nucleotides 106 to 1503 shown in SEQ ID NO: 1 or a nucleotide sequence complementary thereto;
v) a nucleic acid consisting of a nucleotide sequence covering nucleotides 103 to 1512 shown in SEQ ID NO: 3 or a nucleotide sequence complementary thereto; and
vi) a nucleic acid according to any one of i) to v), which is DNA.
2. A method for transferring N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with β-1,3 linkage, using the β1,3-N-acetyl-D-galactosamine transferase protein produced by the method according to claim 1.